EDITORIAL


This issue of the Journal of Experimental Psychology marks the end of 144 issues and 12 years of my editorship. During these 12 years there have been marked changes in experimental psychology, chiefly in the directions of more mathematical formulation and testing of theory, more elaborate (and more adequate) experimental designs and statistical analyses, more emphasis on the human subject, and more emphasis on the "higher mental processes" of the human subject, whether in learning and problem solving or in "information processing" performance. The Journal has attempted to respond to and reflect these changes, even encourage them, at the same time that it remained a sympathetic medium for more traditional problems and methods and for the wide range of content properly described as the experimental analysis of the mental and behavioral processes of the individual organism, especially man.

Provided that the problem of an experimental study fit within the content boundaries prescribed for the Journal, the criterion for acceptance of an article has been at all times the question whether it warranted space in the ever-more-crowded archives of our science. This is, of course, a multidimensional criterion, some dimensions being quite objective and some being quite subjective. Objective, or at least rational, dimensions were matters pertaining to the adequacy of the experimental design for the collection of data on the problem as stated, the adequacy and appropriateness of the measures extracted from the data and the statistical tests employed, and the logical relationship between the data exhibited and the conclusions drawn. Criticism and rejection on the basis of these characteristics of the experiment may be considered as the application of "internal" criteria, since the emphasis is on internal consistency. Here the question was usually formulated as "Is this a valid experiment?" Many times the answer has been "No." Proper control groups were not tested; the design confounded variables that made the results and conclusions trivial, rather than important; the chosen method of summarizing the data led to conclusions that did not hold up if the data were summarized in another equally appropriate, or more appropriate, way; etc.

The next step in the assessment of an article involved a judgment with respect to the confidence to be placed in the findings: confidence that the results of the experiment would be repeatable under the conditions described. In editing the Journal there has been a strong reluctance to accept and publish results related to the principal concern of the research when those results were significant at the .05 level, whether by one- or two-tailed test! This has not implied a slavish worship of the .01 level or any other level, as some critics may have implied. Rather, it reflects a belief that it is the responsibility of the investigator in a science to reveal his effect in such a way that no reasonable man would be in a position to discredit the results by saying that they were the product of the way the ball bounced. At least, it was believed that such findings do not deserve a place in an archival journal, even though they may be proper fare for symposia, scientific meetings, and dittoed handouts. The P level of a finding which was the major purpose of the investigation (anyone can find a significant practice effect) is only one element in the persuasion, others being the relation of necessity between the predicted relationship and other previously or concurrently demonstrated effects, and the consistency of the relationship across a sequence of experiments. But an isolated finding, especially when embodied in a 2 × 2 design, at the .05 level or even the .01 level was frequently judged not sufficiently impressive to warrant archival publication. The same philosophy applied when negative results were submitted for publication, but here rejection frequently followed the decision that the investigator had not given the data an opportunity to disprove the null hypothesis, i.e., the sensitivity of the experiment was substandard for the type of investigation in question and was therefore not sufficient to persuade an expert in the area that the variable in question did not have an effect as great as other variables of known significant effect.

Even if a proffered experimental 
study passed the hurdle of design 
adequacy and judged repeatability, 
it still needed to pass another hurdle, 
and an increasing number of articles 
failed this third hurdle as we moved 
into the late 1950s and early 1960s. 
Increasingly, we applied a criterion of 
substantiality to experimental studies. 
By this was meant that the investiga- 
tion should not merely identify the 
effect of a variable, but should move 
beyond that simple demonstration to 
either the determination of a function 
relating levels of the variable to levels 
of the effect or the assembly of further 
information about the demonstrated 
effect—the composition of the vari- 
able, the range of tasks in which it had 
an effect, the variation of the effects 
as a function of a second or third 
variable, etc. We believed that the 
day of the archival report based on a 
simple experiment with an experi-
mental and control group or with a
2 × 2 design was past in many
mature areas of psychological re- 
search, and that each published report 
should make a more substantial 
contribution to the problem. In 
particular, it seemed desirable for 
experimental psychology to move 
toward the determination of quantita- 
tive functional relationships between 
independent and dependent variables, 
especially since so many of these 
quantitative relationships in behavior 
turn out to be nonmonotonic. Failure
to make a serious effort to understand 
the variable, either through plotting 
the effects of several levels of it or 
through follow-up experiments with
the intent of determining the gen- 
erality or other contingencies of its 
effect, was considered sufficient reason 
to choose not to publish until such 
additional work had been done. 







These, then, have been the guiding criteria for acceptance of articles for the Journal. When an article was rejected, an attempt was made to state the basis for rejection in terms of those criteria. But the same criteria were employed as the basis for required revisions prior to publication. Only about 10% of all articles received by the Journal were published without substantial revision or rejection, which is to say that four out of five of the published articles (which were, in turn, 50% of those received) have suffered substantial revision before publication. Very often the revision was required for the purpose of condensing the article, eliminating duplicated data in figures and tables, and in other ways decreasing the length of an article without reducing its essential content. But with a frequency that would surprise some readers, required revisions have consisted of adding detailed description of procedures, making explicit some design factor, adding data in tables or figures, and even urgings to the author to add words in order to make more explicit his analysis and interpretations. In short, the philosophy of acceptance has been that an article should not be rejected if the experiments were acceptable, even though major revision of the data analysis or the article as a whole was required. In fairness to contributors who submitted completely acceptable articles, those who were required to revise were given a limited period of time (usually 30 days) in which to resubmit the revision without loss of position in the publication order. The intent of these criteria as applied either to acceptance or revision before publication has been to get more scientific mileage from the pages of the Journal. This was done by excluding questionable data, by encouraging the reporting of research in larger, more substantial chunks, and by reporting research as completely as necessary but also succinctly. All of this stems from the conception of the Journal as an archive of our science, not as a newssheet filled with a heavy load of transient, undigested, or fallible information. However, this screening of information for such archival records is not an infallible procedure. The criteria are not reducible to formula, and the final judgment is intrinsically subjective. Therefore, studies have been accepted and published, only later to be judged inadequate by the criteria; others have been rejected and published elsewhere, later to be widely acclaimed as containing important data. (I do not include in the latter class those experimental articles that deserved to be rejected because of grievous faults in the design, but which, when published elsewhere, stimulated research on a problem owing to the very inadequacies of the original experiment.)

It should be clear from what has been said about criteria that heavy demands were placed on the editorial staff for detailed information about what is already known and for methodological sophistication, and this demand applied to a myriad of special problem areas in the case of an omnibus journal such as the Journal of Experimental Psychology. The heart of the editorial system is now and has been the board of Consulting Editors of the Journal, some of whom served for the entire 12 years and all of whom have made multiple, essential contributions to the implementation of the criteria and standards that all of us considered to be in the best interest of the science. It is fitting, therefore, that all of the Consulting Editors of the Journal be given this public repetition of my private expressions of appreciation and indebtedness for making my editorship of the Journal possible. The list follows:

Norman H. Anderson (1959-62)
E. James Archer (1957-62)
Fred Attneave, III (1959-62)
Judson S. Brown (1951-58)
Cletus J. Burke (1953-62)
James Deese (1960-62)
Paul M. Fitts (1957-62)
Frederick C. Frick (1957-59)
Robert M. Gagné (1953-56)
Wendell R. Garner (1954-56)
Frank A. Geldard (1951-62)
James J. Gibson (1951-62)
Clarence H. Graham (1951-61)
David A. Grant (1951-56)
Harold W. Hake (1957-62)
Lloyd G. Humphreys (1951-59)
Arthur L. Irion (1954-62)
Howard H. Kendler (1957-62)
Herschel W. Leibowitz (1962)
Donald B. Lindsley (1951-62)
Kenneth MacCorquodale (1957-62)
Quinn McNemar (1957-62)
Neal E. Miller (1951-57)
Edwin B. Newman (1951-54)
Leo Postman (1961-62)
L. Starling Reid (1958-62)
Kenneth W. Spence (1953-62)
Benton J. Underwood (1951-56)
Delos D. Wickens (1951-56, 1959-62)

In addition to those listed, many other psychologists have made important contributions to the Journal through their reviews of specific articles where their competence was deemed necessary for knowledgeable evaluations. Each of them, if he reads this, will, I hope, consider himself again privately thanked for his contribution.

This tribute to the Consulting Editors of the Journal must not be interpreted as shifting to their shoulders responsibility for the Type I and Type II errors we have made in accepting or rejecting articles. Their relationship to the final decision, which was always made by myself or by an Associate Editor, was understood at all times to be advisory. There were times, not very many, when Consulting Editors did not agree in their evaluation, and there were times, again not very many, when we accepted even though the Consulting Editor recommended rejection, or rejected even though he recommended acceptance. In any event, he was informed of the action taken on his advice, since he received copies of the letters to authors which indicated acceptance, rejection, or requirements for revision. Perhaps we should take this opportunity to thank all those Consulting Editors who tolerated our bad judgment when we failed to follow their advice, and who did not resign forthwith; none did, to my knowledge.

Penultimately, I wish to express my deep appreciation to the three who served as Associate Editors of the Journal: David A. Grant (1957-62), Delos D. Wickens (1957-58), and William K. Estes (1959-62). Editing the Journal during these last 6 years would have been intolerable, if not impossible, without each of them assuming roughly one-third of the responsibility for deciding what should and should not be in the Journal. As many who have contributed to the Journal know, their roles were those of Co-Editors, with complete responsibility for conducting the relationship with authors up through the decision to accept or reject. The ability of the Journal to judge appropriately some of the technical and theoretical innovations of recent years is largely a consequence of their participation in the editing of the Journal.

Finally, it would be thoughtless of me to bring this swan song to a close without expressing our appreciation of the tolerance (sometimes requiring incubation) that the vast majority of contributors to the Journal have shown toward the editorial mayhem performed, sometimes repeatedly, on the products of their thought and work. Experimental work in psychology is terribly laborious business, as every one who has done it knows. When, after all is done and a report of it is written, it is nothing short of mental cruelty to have an editor require that it be cut in half, and even worse to have him recommend that it serve as a lesson in how to do a better job next time. While I have no illusion that this editorial role has increased the quantity of warm sentiments that come my way, and I know that I have been hung in effigy in some laboratories and offices, it is still my hope that producing experimental psychologists who make our science grow apace, and who make the Journal possible, recognize the attempt to be fair and explicit, if not the wisdom, in the decisions they have suffered. Some authors have even been so kind as to say that this is so.

I feel no reluctance, only gratification and confidence, as I relinquish the editorship to my able friend and colleague, David A. Grant. These sentiments relate not only to his editorship, but also to the vigorous, sometimes combative, state of experimental psychology and experimental psychologists.

ARTHUR W. MELTON 
FREE RECALL LEARNING OF VISUAL FIGURES AS A FUNCTION OF FORM OF INTERNAL STRUCTURE

JAMES R. WHITMAN
Veterans Administration Hospital, Perry Point, Maryland

AND W. R. GARNER
Johns Hopkins University


The literature on factors affecting 
free recall learning is voluminous. 
The kinds of factors which have been 
investigated involve such things as the 
number of items to be learned, prior 
experience with the items, meaning- 
fulness of the items, interitem simi- 
larity, etc. Most of the experiments 
reported have had as an assumption, 
either explicit or implicit, that char- 
acteristics of the individual items or 
stimuli are critical. 

Garner (1962) has emphasized another aspect of free recall learning, namely, the internal structure inherent in groups of stimuli to be learned. A set of stimuli can be considered to be generated by a series of variables, and these variables can be interrelated in any subset of stimuli actually used in the learning experiment.¹ This relatedness constitutes the internal structure of the stimuli or the variables which make up the stimuli. The amount of internal structure is the same as the redundancy of the subset of stimuli and is determined by the number of stimuli used in an actual subset compared to the number which could have been generated with the same number (and levels) of the variables.

¹ In this paper we shall use the term set to indicate the group of all possible stimuli generated by the specified variables and levels; we shall use the term subset for any group of stimuli which does not include all possible stimuli.

If the number of stimuli used in an experiment is the same as the total number which could be generated from the given variables, then all variables are orthogonal to each other and there is no internal structure to be learned. Knowing these variables, S can reproduce the set of stimuli without practice. But if the number of stimuli used is less than the number which could have been used, then internal structure exists; and both the amount and the form of this structure will affect ease of free recall learning. The internal structure is not determined by the relations between elements of any particular stimulus. Rather, it is determined by the relations between the variables making up the stimuli across the subset of stimuli actually used. Thus the amount and form of internal structure cannot be specified without knowing the exact subset of stimuli used in the learning experiment.
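
The relation between subset size and amount of internal structure can be illustrated with a minimal sketch. It assumes equally likely stimuli, so that the amount of structure is the base-2 logarithm of the total set size minus that of the subset size; the function name `internal_structure_bits` is hypothetical and is not taken from Garner (1962).

```python
import math


def internal_structure_bits(levels_per_variable, subset_size):
    """Redundancy of a subset in bits, assuming equally likely stimuli.

    levels_per_variable: number of levels of each generating variable.
    subset_size: number of distinct stimuli actually used.
    """
    total_set_size = math.prod(levels_per_variable)
    if not 0 < subset_size <= total_set_size:
        raise ValueError("subset size must be between 1 and the total set size")
    # Maximum uncertainty of the total set minus the uncertainty of the subset.
    return math.log2(total_set_size) - math.log2(subset_size)


# Four variables with three levels each generate 81 possible stimuli.
full_set = internal_structure_bits([3, 3, 3, 3], 81)    # no structure to learn
nine_of_81 = internal_structure_bits([3, 3, 3, 3], 9)   # a subset of nine stimuli
print(full_set, round(nine_of_81, 2))
```

Under this assumption a subset of 9 stimuli drawn from 81 carries log2(81/9) bits of structure, roughly 3.17 bits, while the full set carries none.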
These considerations led Garner (1962) to state two specific hypotheses, the tests of which are the purpose of this experiment. First, the ease of free recall learning is not a question of the characteristics of the individual stimuli which make up the subset but is a question of the characteristics of the entire subset of stimuli. Thus the same stimuli imbedded in two different subsets of stimuli will be learned according to the characteristics of the subset within which they are imbedded, and the nature of the unique stimuli is irrelevant. Second, the form of the internal structure is a critical factor in learning even with the same total amount of internal structure. Specifically, those forms of structure which involve direct contingencies between pairs of variables will produce easier learning than will forms of structure involving complex relations among three or more variables, i.e., interactions.

While these hypotheses are relevant to free recall learning of any stimuli, the present experiment tests them specifically with free recall learning of visual figures.


METHOD 
The Stimuli 


In carrying out an experiment to test the importance of the form of internal structure in free recall learning, it is important that the amount of internal structure be held constant. In more specific terms, it is important to specify not only the subsets of stimuli actually used but also the total set of stimuli which could have been used, since the ratio between these determines the amount of internal structure.

Total set of potential stimuli. The potential stimuli, or the complete set of stimuli from which the subsets were selected, were formed by using three levels or values of each of four variables. If all possible combinations of these variables are generated, the total number of possible stimuli is 81. The four variables and their values are: (a) Shape, with squares, triangles, or circles constituting the levels; (b) Lines, with two, one, or zero lines bisecting the shape; (c) Spaces, with a space on the left, on the right, or no space; and (d) Dots, with a dot above the shape, below it, or none.
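
As an illustration of how the total set arises, the following sketch enumerates every combination of the four variables. The dictionary keys and level labels are our own shorthand for the levels listed above, not the coding used in the figures.

```python
from itertools import product

# Three levels of each of four variables; labels are illustrative shorthand.
VARIABLES = {
    "shape": ("square", "triangle", "circle"),
    "lines": (2, 1, 0),
    "space": ("left", "right", "none"),
    "dot": ("above", "below", "none"),
}


def total_set(variables=VARIABLES):
    """Return every stimulus generated by crossing all levels of all variables."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


stimuli = total_set()
print(len(stimuli))  # 3 * 3 * 3 * 3 combinations
```

Any experimental subset is then a selection of stimuli from this list, and its internal structure depends on which ones are selected.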

Subsets of actual stimuli. Three different subsets of stimuli to be used in the experiment were chosen from this total set. Each subset contained 9 different stimuli from the 81 possible and differed only with regard to the form of the internal constraint. In selecting these three subsets, it is important that each subset demonstrate all four variables of the total set and, furthermore, that each level of each variable occur equally often. This precaution is necessary to ensure that the factor of total amount of internal structure not be confounded with the form of internal structure. All three subsets of actual stimuli are shown in Fig. 1, along with coded values of the four variables.

Fig. 1. The three subsets of actual stimuli used. (The number to the right of each figure provides the coded values for the four variables, in the order, shape, space, line, and dot. The underlined coded values for three figures each in Subsets B and C represent the three identical figures for these subsets.)

Subset A: In the first subset of nine visual figures each of the four variables occurs three times at each level, but no two of the variables are directly correlated. Since there are four variables, there are six pairs of variables; and none of these pairs has a contingency greater than zero. This subset of stimuli is equivalent to a Graeco-Latin square, in which four variables are all orthogonal to each other.
Subset B: The second set of figures was selected so that one of the six pairs of variables was perfectly correlated and the other five were orthogonal. This subset provides a condition intermediate between subsets with minimum and maximum contingencies between variables.

Subset C: The last subset of figures was selected so that three of the six pairs of variables were perfectly correlated while the other three were uncorrelated. This subset provides the maximum contingency between pairs which can exist. Since nine different stimuli are required, at least one pair of variables must be orthogonal, a restriction which also means that no more than three of the pairs can be correlated.
TABLE 1

UNCERTAINTY ANALYSES OF THE FORM OF INTERNAL STRUCTURE OF THE THREE SUBSETS OF STIMULI, WITH SHAPE (W), SPACE (X), LINE (Y), AND DOT (Z) AS VARIABLES

[Table body not recoverable from the source. Surviving labels: column heading "Subsets"; row groups "Simple contingencies" (first row W:X) and "Interaction".]



Three of the stimuli in Subset C are iden- 
tical to three in Subset B. These identical 
stimuli were included to allow comparison of 
learning rates for these particular stimuli, to 
determine the extent to which learning is 
affected by the particular stimuli rather than 
by the characteristics of the subset. 

To summarize the characteristics of these three subsets of stimuli, each subset contains nine different stimuli; in addition, each subset contains exactly three occasions of each of the three levels of each of the four variables. Thus the number of specific elements which compose the different subsets is identical in all cases.

These subsets differ only in regard to the form of the internal structure, and an uncertainty analysis (in bits) of these three subsets of stimuli is shown in Table 1. The total amount of internal structure (W:X:Y:Z), shown at the bottom, is the same in all three cases, 3.16 bits. In Subset A, however, all of this structure is in the form of interactions, and none of the simple contingencies (between pairs of variables) is greater than zero. In Subset B, one simple contingency exists, and the rest of the structure is in the form of interactions. The pattern here is somewhat more complex. Again the interaction involving all four variables is negative and serves to correct the three-variable interactions. In Subset C, the maximum amount of simple contingency exists since three of the pairs are perfectly correlated. Since this amount of structure is greater than the total structure, again the negative interaction term occurs to correct the total.

Our hypothesis concerning form of structure concerns the amount of structure which exists in the form of simple contingencies. In Subset A, none does; in Subset B, 1.58 bits does; and in Subset C, 4.74 bits does.
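
The amounts of simple contingency just stated can be checked with a short sketch. It assumes that the contingent uncertainty of a pair of variables is U(X) + U(Y) − U(X, Y), computed over the nine equally likely stimuli of a subset. The three subsets constructed here are our own examples satisfying the stated conditions (levels coded 0, 1, 2); they are not the actual figures of Fig. 1, and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def uncertainty(values):
    """Uncertainty in bits of the empirical distribution of values."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def contingent_uncertainty(subset, i, j):
    """U(Vi : Vj) = U(Vi) + U(Vj) - U(Vi, Vj) across the stimuli of a subset."""
    xs = [s[i] for s in subset]
    ys = [s[j] for s in subset]
    return uncertainty(xs) + uncertainty(ys) - uncertainty(list(zip(xs, ys)))


def total_simple_contingency(subset):
    """Sum of the contingent uncertainties over all pairs of variables."""
    n_vars = len(subset[0])
    return sum(contingent_uncertainty(subset, i, j)
               for i, j in combinations(range(n_vars), 2))


LEVELS = range(3)
# No pair correlated (a Graeco-Latin square).
subset_a = [(w, x, (w + x) % 3, (w + 2 * x) % 3) for w in LEVELS for x in LEVELS]
# One pair (first and second variable) perfectly correlated, five orthogonal.
subset_b = [(w, w, y, (w + y) % 3) for w in LEVELS for y in LEVELS]
# Three pairs (among the first three variables) perfectly correlated.
subset_c = [(w, w, w, z) for w in LEVELS for z in LEVELS]

for name, subset in (("A", subset_a), ("B", subset_b), ("C", subset_c)):
    print(name, round(total_simple_contingency(subset), 2))
```

Each perfectly correlated pair contributes log2 3 bits, so the totals are zero, one, and three times that amount. The text's 1.58 and 4.74 correspond to taking log2 3 as 1.58, whereas the unrounded values are closer to 1.585 and 4.755.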


Subjects 


All Ss were personnel associated with a 
large VA hospital and included 16 summer 
students and 9 staff members with professional 
degrees. They ranged in age from 15 to 58 
yr. There was a total of 39 Ss, and they were 
assigned randomly to each of three groups 
with the restriction that each group contain 
an equal number of professional staff and 
students, insofar as possible. (A median test 
showed no difference in performance between 
the different kinds of Ss.) Each group of Ss 
was required to learn just one of the three 
subsets of stimuli. 


Materials 


The stimulus figures were drawn with black India ink on individual white pasteboard sheets, 8½ × 11 in. The diameter of the circles and the sides of both triangles and squares were 6 in. All spaces were 2 in. in width and were centered. Dots were solid, [?] in. in diameter, and centered [?] in. below or above an edge. All lines were solid, about 1/32 in. wide. Lines within the patterns were centered, and when two were present they were [?] in. apart.


Learning Trials 


The Ss were tested either individually or in
small groups, depending on availability. The 
E stood in front of Ss and held each stimulus 
card from a subset for 5 sec., with one 
stimulus immediately following another. The 
order in which the stimuli were presented was 
predetermined so that no figure on any trial 
followed the same figure that it had on the 
preceding trial, and each figure was presented 
once as the first and once as the last in a 
series of nine trials. 

The Ss were told that they were partici- 
pating in an experiment to see how fast they 
could learn to reproduce from memory nine 
different diagrams or figures which would be 
shown to them. The E then described the 
four characteristics of the figures and the 
levels of each, giving illustrations by using 
figures not later used in the experiment. The
words "angles," "spaces," "lines," and "dots"
were then written on a blackboard as a 
reminder of the four characteristics, and these 
remained in view throughout the experiment. 

The Ss were then told that (a) the figures were going to be presented one at a time; (b) the order in which they were presented would vary; and (c) after they had seen all nine on a given trial they were to draw the figures from memory in any order. For this purpose they were given answer sheets containing nine blank spaces in three rows of three each. The Ss were instructed to draw nine different figures each time, guessing if necessary.

A trial consisted of the presentation of a complete subset of stimuli followed immediately by an answer period of 2 to 3 min. for Trial 1 and 1.5 to 2.5 min. for subsequent trials. At the end of 1.5 min. (2 min. for Trial 1) S was urged to complete nine different reproductions. At the end of the answer period, S covered his answers and was instructed not to look at them again. During Trials 1-5 E described each stimulus in terms of the four variables as it was presented. Practice continued for 20 trials or until S had correctly reproduced all nine figures on a single trial.


Measures 


The reproductions of S were scored in the following ways: (a) number of trials in order to reproduce correctly all nine figures; (b) number of correct responses on each trial; and (c) the amount of simple contingency between pairs of variables in the reproductions, without regard to correctness of response. This latter measure is simply a matter of determining the form of the internal structure in each set of nine reproductions in the same way that the stimuli themselves are described. In determining contingent uncertainties from the reproductions, an approximation procedure was used to facilitate computation. The number of pair coincidences was counted, and the total of these was translated into contingent uncertainty by a computed graphical function.


RESULTS 


Form of structure—The main re- 
sults pertain to the hypothesis con- 
cerning the effect of form of structure 
on free recall learning. Table 2 shows 
the number of trials required for a 
criterion of nine correct reproductions 
for the three groups. Median trials
were used, rather than means, because 
6 of the 13 Ss learning Subset A had 
not learned the stimuli to the criterion 
within the 20 trials allowed. Subset 
A had stimuli with no pair contin- 
gencies, and it is clear that such a 
subset of stimuli is very difficult to 
learn. By contrast, Subset C, with 
maximum simple contingencies, was 
extremely easy to learn, with a median 
of just two trials. In fact, 5 of the 13 
Ss learning Subset C correctly re- 
produced all nine stimuli on Trial 1. 

Analysis of the data in terms of 
number of correct reproductions per 
trial shows equally clearly the great 
difference between these subsets of 
stimuli. This analysis, shown in Fig. 
2, indicates how rapidly learning 
occurs with the high simple contin- 
gencies and how far from complete it 
is even after 20 trials with the zero 
contingencies (Subset A). The evi- 
dence in favor of the hypothesis could 
not be much clearer. 

Individual stimuli. Three stimuli in Subset B were identical to three stimuli in Subset C. If the characteristics of the individual stimuli are important in free recall learning, then these three stimuli should have been learned at the same rate regardless of the subset within which they were imbedded. Analysis of number of correct reproductions of just these three stimuli was carried out for each subset separately, and the data obtained are plotted in Fig. 2 as the open squares and triangles. Since these data are plotted in terms of percentages of correct responses, direct comparisons of the learning curves are possible.

TABLE 2

NUMBER OF LEARNING TRIALS TO CRITERION FOR THE THREE SUBSETS OF STIMULI

[Table body not recoverable from the source. Surviving column headings: Subsets, N, Median, Range, and a heading ending in "Whitney"; surviving entries: "9-20+" and "19".]

Fig. 2. Percentages of correct responses as a function of trial for the various subsets of figures. (The filled points are data for all nine figures of each subset. The open points are data from just those three figures in Subsets B and C which were identical. Each point is the mean for 13 Ss.)


The curves for the three particular
stimuli follow almost exactly the
learning curves for the subsets within 
which they were contained and bear 
little relation to each other. The 
evidence could not be much clearer 
that the characteristics of the in- 
dividual stimuli are of little relevance 
in free recall learning of subsets of 
stimuli but rather that the character- 
istics of the entire subset of stimuli are 
important. 

In fact, when it is recalled that one-third of the stimuli in Subsets B and C were identical, the large difference in learning rates for these two subsets is even more impressive. Apparently what is learned is not the individual stimulus but a total set of relations between variables which make up the stimuli.


Internal structure in reproductions. These data show that subsets of
stimuli in which there are high simple
contingencies are easy to learn. In 
order to provide some additional 
understanding of the role of this 
factor in free recall learning, the 
amount of simple contingencies (in 
bits) was determined for Ss’ repro- 
ductions on each successive trial; and 
these contingency results are shown 
in Fig. 3. There are several factors of 
interest in these curves. 

First, in order to reproduce cor- 
rectly all nine figures, the reproduc- 
tions must contain the same pattern 
of contingencies as the stimuli them- 
selves had. But this pattern is simply 
a prerequisite condition since it is 
possible to have the same total 
amount of simple contingencies but 
not to have the correct pairing of 
variables. Thus part of the learning 
process involves learning to reproduce 
the correct pattern of contingencies. 

Second, analysis of the contingencies in the reproductions can give us some idea of what seems natural to Ss, and the data in Fig. 3 clearly show that Ss produce a very high level of contingencies in their first few trials. This level, close to the maximum possible, is so high that there is actually very little learning for Subset C, in which the maximum possible contingencies are found. But for the other two subsets, this level is much too high, and in effect Ss must learn to undo this apparently natural tendency to produce high contingencies before they can reproduce the stimuli correctly.

Fig. 3. Simple contingencies in reproduced subsets as a function of trial. (Each point is the average of the total amount of the simple contingencies in bits in the subsets as reproduced by S without regard to correctness of the reproductions. For Subsets B and C, each plotted point is the average of the 13 Ss’ scores. The Ss for Subset A are divided into seven learners and six nonlearners.)

In order to obtain some idea of whether these high contingency values are simply the result of random pairings of variables, we produced 10 random sets of stimuli, in which it was only required that each level of each variable occur equally often and that all 9 patterns be different. The mean contingency value for these 10 randomly selected patterns was 3.23 bits, with a standard error of .18 bits. This value is so far below those for early trials that it is clear that Ss do not just produce random amounts of simple contingencies but produce close to the maximum possible value. And these high values are produced even when the stimulus subsets themselves contained much lower values. Apparently these low values of simple contingency are contrary to Ss’ normal expectations.
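
As a rough illustration of the random-set check described above, the following sketch draws sets of nine patterns under the two stated constraints and sums the pairwise contingencies in bits. It assumes four variables with three levels each and treats the simple contingency between two variables as their contingent uncertainty (mutual information); neither assumption is stated in this passage, and all function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from math import log2

LEVELS = 3        # assumed number of levels per variable
VARIABLES = 4     # assumed number of variables per figure
SET_SIZE = 9      # nine patterns per subset, as in the text


def contingency_bits(xs, ys):
    """Contingent uncertainty (mutual information) between two variables, in bits."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def total_simple_contingency(patterns):
    """Sum of the pairwise contingencies over all pairs of variables."""
    columns = list(zip(*patterns))
    return sum(contingency_bits(a, b) for a, b in combinations(columns, 2))


def random_subset(rng):
    """Draw nine distinct patterns in which every level of every variable occurs equally often."""
    base = [level for level in range(LEVELS) for _ in range(SET_SIZE // LEVELS)]
    while True:
        columns = [rng.sample(base, len(base)) for _ in range(VARIABLES)]
        patterns = list(zip(*columns))
        if len(set(patterns)) == SET_SIZE:   # reject sets containing repeated patterns
            return patterns


rng = random.Random(0)
values = [total_simple_contingency(random_subset(rng)) for _ in range(10)]
print(sum(values) / len(values))  # mean over 10 random sets, in bits
```

The printed mean need not agree with the 3.23 bits reported, since both the stimulus dimensions and the exact contingency measure are assumed here.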

The data for Subset A are plotted separately for Ss who learned and those who did not learn the figures within the 20 trials. The nonlearners show very little evidence of learning at all, and there is the strong suggestion that some Ss never would learn the figures. Actually, while the experiment was cut off at 20 trials, an attempt had been made to continue it for these nonlearners. Two of the six nonlearners were continued to 30 trials and had not yet learned. Two others refused to continue the experiment shortly after 20 trials, because they felt they never would learn the figures. These stimuli are very difficult indeed to learn, and there is the suggestion that some Ss cannot deal with or conceptualize completely uncorrelated stimulus variables.


DISCUSSION 


The results of this experiment leave little doubt that the context of interrelationships between variables within a subset (internal structure) is critical for free recall learning, and that the two hypotheses initially stated are valid: Free recall learning is a function of the structural characteristics of the entire subset of stimuli, not of the individual stimuli; and internal structure which exists in the form of simple contingencies between variables is better for free recall learning than are more complex forms of structure. Each of these points deserves some comment, and we shall take them in reverse order. Miller (1958) showed that free recall learning is easier for what he called redundant strings of letters. He generated nonsense words by different statistical rules which affected the sequential dependencies between successive letters in the words and found that high sequential dependencies gave better learning. Since the lists which he compared, however, were of the same length, and since the number of different letters possible was the same for each list, it is clear that his experiment concerned not the amount of redundancy but rather its form. It is more difficult to state the amount of simple contingency in his lists since all words were not of the same length, but the nature of the differences was certainly similar to the differences used in the present experiment.

In this experiment, the amount of redundancy was the same in all three sets of stimuli, but the amount itself should be an important variable for free recall learning. Horowitz (1961) compared lists of letter trigrams differing in similarity by Underwood's (1954) definition of similarity as the extent to which words on the list share the same items. By this definition, low similarity is equivalent to high redundancy or internal structure and vice versa since low similarity lists have many different levels or values per variable, a fact which means that the set of potential stimulus words is very high compared to the number of words actually used. He found better free recall learning in early trials for the high similarity (low redundancy) lists.

In generating his lists, Horowitz used, for the low redundancy lists, a form of redundancy in which pairs of variables (letter positions) were very nearly uncorrelated. As the present results show, such a form is poor for free recall learning. His high redundancy lists, on the other hand, had 12 different letters; and he used all of them in each of his three letter positions. Such a procedure means that each letter in each position is paired uniquely with a letter in each other position so that pair contingencies are necessarily high. With so many letters per position no other relation is possible. Yet the net effect is that Horowitz used a good form of structure with his high redundancy and a poor form for his low redundancy. It is almost certain that if Horowitz had used a good form of structure with his low redundancy lists, or even random pairings of letters, he would have obtained much larger differences between his low and his high redundancy lists.

There is another point that stems from Horowitz’ experiment that needs emphasis here. He showed that the relations between similarity and learning depended on the kind of learning required. In the present context, it is almost certain that the results we have obtained are true for free recall learning, but they will probably not be true for kinds of learning which involve discrimination between items. Garner (1962) has discussed this problem in detail.

Many other experiments, summarized by Garner (1962), have shown that discrimination processes depend on the total set of stimuli rather than the individual stimuli. Klemmer and Loftus (1958), for example, showed that identification of numerals with brief visual exposures depends on the total set of forms within which the numerals are imbedded. Our experiment shows that similar considerations hold for learning processes.

It should be emphasized that the 
characteristics of a group of stimuli are 
not simply the sum of the characteristics 
of the individual stimuli but are char- 
acteristics which can exist and be 
specified only for the total subset. Thus 
what is learned is the entire subset. In 
actual fact, the problem must really be 
put in reverse: We cannot specify the 
characteristics of the individual stimulus 
until we know the characteristics of the 
entire subset since the nature of the re- 
quired differentiations depends on the 
alternative stimuli within the subset. 


SUMMARY 


This experiment tested two hypotheses 
relating free recall learning to the form of the 
internal structure: (a) the ease of free recall 
learning depends not on the characteristics of 
the individual stimuli but on the characteristics 
of the entire subset to be learned; 
(b) when a subset of stimuli is characterized 
by simple contingencies between pairs of 
variables generating the set, free recall learn- 
ing will be easier than when the subset is 
characterized by interactions involving three 
or more variables. 

Three different forms of internal structure 
in subsets of visual figures were compared. 
The results showed clear differences in the 
predicted direction and both hypotheses were 
substantiated. 


It has been shown (Wilkinson, 1958) that there are some tasks which most people can perform as well as normally under 30 hr. sleep deprivation and also that there are some individuals whose performance seems quite unaffected by the stress whatever the task. Is lack of sleep completely without effect in these situations or is the effect appearing in some form which is not being measured? It has been suggested (Wilkinson, 1961) that motivational factors are important in deciding whether performance will be impaired; a man appears capable of performing normally in spite of loss of sleep if the rewards for doing so or the penalties for failing to do so are sufficiently great. The present hypothesis is that this will only be done at the expense of extra effort and that electromyographic (EMG) records of muscular responses may provide some indication of this. In this experiment, therefore, EMG has been measured concurrently with an assessment of the effect of loss of sleep on performance.

experiment has formed part of a paper in the “CIBA Symposium on the Nature of Sleep,” the proceedings of which have been published by Churchill, London.


METHOD 


Procedure.—Twelve Ss, enlisted men between the ages of 18 and 30, carried out a 20-min. test of addition twice at an interval of 2-4 days, once with sleep and once without. The design was balanced for practice effects and for the possibility of the two test papers being of unequal difficulty. While doing the test, and for 2 min. before and after it, records of muscle tension (EMG) were taken.

The test.—Sitting alone in a cubicle Ss were given a sheet of 100 sums and required to complete as many as possible in 20 min. Each sum comprised five two-digit numbers to be added, the total to be written down and also spoken into a microphone. At the 15-min. point of the test E intervened, speaking to S through a loudspeaker in the cubicle. He said, “Now I want you to work faster and more accurately, and to help you I will tell you the time you take for each sum and whether you get it right or wrong.” This knowledge of results (KR) was given throughout the last 5 min. of the test. On the previous day Ss were given a practice run, the procedure, including the recording of EMG, being exactly the same as in the main tests except that the run lasted only 10 min. and E did not intervene with KR.

Sleep deprivation.—In their experimental test 6 Ss had been without sleep for some 56 hr. and the other 6 for about 32 hr. All were tested in the afternoon and this applied also to the control tests after normal sleep. The Ss carried out routine duties and some other tests while staying awake but were in no way overworked apart from the stress imposed by enforced wakefulness.

Fig. 1. Speed of adding with and without sleep. (Figure labels: sums done; knowledge of results given.)

EMG recording.—EMG records were taken from a placement over the pronator teres muscle of the left (inactive) forearm, Ss being asked to allow the arm to hang loosely by their side as they sat at the table doing the sums or relaxing. A single-channel machine of private design was used having an input impedance of 250 KΩ; pulses reflecting the integrated output were recorded on one channel of a tape recorder while the other channel recorded by microphone the proceedings in the test cubicle. This record comprised mainly S’s answers to the sums and E’s encouragement and KR. Bipolar sponge electrodes were used and the skin was abraded to give reasonably equal and low resistance (between 10 KΩ and 3 KΩ) from each electrode to the reference electrode on the active forearm. As each S was tested twice and comparisons made between the levels of EMG on each occasion it was essential that the placement and recording sensitivity should be approximately the same each time. To achieve this a patch of adhesive tape was placed over the proposed site of the electrodes. There were two holes in this patch, 1} in. apart and ;’, in. in diameter. In preparation for the first test the skin was abraded through these holes and the sponge electrodes placed immediately over them. The adhesive patch remained in place until the second test 2 or 4 days later when the electrodes were again placed over the holes and the skin abraded where necessary to achieve an electrode-to-electrode resistance stabilizing at approximately the same level as preceded the first test. Each test was preceded and followed by 2 min. relaxation when S sat back in his chair and rested. EMG records were taken throughout and the score of “level of EMG” is the ratio of the average EMG during the test to the average during the preliminary period of relaxation. A further index of EMG is that of its variability in any given S during a test. This score of EMG variability reflects the variance (calculated as the coefficient of variability) of the minute to minute counts of EMG in each of the four 5-min. periods of the test.
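
The two EMG scores defined above can be sketched in a few lines. This is a minimal illustration only: the counts are hypothetical, the function names are invented here, and the use of the sample standard deviation (n - 1) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def emg_level(test_counts, rest_counts):
    """Ratio of the mean EMG count during the test to the mean count during rest."""
    return mean(test_counts) / mean(rest_counts)


def emg_variability(minute_counts):
    """Coefficient of variability: standard deviation divided by the mean."""
    # Assumption: sample standard deviation (n - 1); the source does not specify.
    return stdev(minute_counts) / mean(minute_counts)


# Hypothetical minute-by-minute EMG counts, not data from the experiment.
rest = [100, 100]                     # 2-min. preliminary relaxation
period = [150, 200, 250, 200, 200]    # one 5-min. test period

print(emg_level(period, rest))        # 2.0
print(emg_variability(period))
```

A level above 1 means that EMG during work exceeded the resting baseline; the variability score is unit-free, so it can be compared across Ss whose absolute counts differ.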

Statistical treatment.—Significance of single means was tested by Wilcoxon’s matched-pairs signed-ranks test, and differences between means by the Mann-Whitney U test. Kendall's rank correlation coefficient (τ) was used for all correlations. All these procedures are described by Siegel (1956). All significance levels refer to two-tail assessments except where otherwise stated.


RESULTS 


This section will give the results and 
their immediate implications; in the 
following section more general im- 
plications will be considered. The 
analysis that follows will be concerned 
mainly with the period of No KR (the 
first 15 min. of the test), but in addi- 
tion we shall consider the changes 
that occurred when this feedback was 
added in the last 5 min. Finally, 
attention will be drawn to a possible 
predictor of the degree to which 
individual performance will be im- 
paired under sleep deprivation. 

TABLE 1

EMG LEVEL AND EMG VARIABILITY WITH AND WITHOUT SLEEP

5-Min. Test      EMG Level(a)              EMG Variability(b)
Periods          Sleep     No Sleep        Sleep     No Sleep
1 (No KR)        1.88      [illegible]     .119      [missing]
2 (No KR)        1.7[?]    [illegible]     .205      [missing]
3 (No KR)        1.6[?]    [illegible]     .137      [missing]
4 (KR)           2.88      [illegible]     .244      [missing]

(a) EMG level is the average EMG count during a given 5-min. test period divided by the average EMG during the preliminary 2-min. relaxation.

(b) EMG variability is the coefficient of variability (V) of the minute to minute counts of EMG.

Period of No KR.—During the first 15 min. of No KR sleep deprivation had no effect upon errors but it reduced the number of sums done (Fig. 1). This result was significant (P < .01) when the Practice × Order interactions were corrected for as follows: half the Ss carried out their first test under sleep deprivation and the second with normal sleep; for the other half the order of the conditions was reversed. All improved with practice from first to second test and the null hypothesis was that if sleep deprivation had no effect the practice effects of the two groups of Ss would not differ. Lack of sleep had little effect upon the level of EMG but it increased EMG variability (P < .02). These trends are shown in Table 1.
To examine concurrent trends of performance and EMG, Ss were ranked in order of impaired performance and of increased EMG due to loss of sleep, and the correlation of the two rankings was assessed. This operation had to be performed separately on each of the four teams of 3 Ss treated alike with respect to order and degree of sleep deprivation. The combined significance of the correlations was then assessed on a permutational basis to give an overall level of significance over all 12 Ss. To explain this further there are six possible combinations of the rankings of two sets of three scores. We have four teams in each of which any of these six combinations may occur. Over all four teams there are then 6⁴ = 1296 possible combinations of rankings. If we emerge with a combination of rankings whose correlations are predominantly negative, for example —1.0, —1.0, —0.33, and —0.33, in the four teams we can calculate the number of combinations out of the whole 1296 which are as negative as this or more negative. There are 41 in this case. The one-tailed probability of a negative combination as great or greater than this is then 41/1296 or .031.

Fig. 2. Increase in EMG level due to sleep deprivation, i.e., Log10 (No Sleep EMG level / Sleep EMG level) in three groups of Ss showing the least, the most, and an intermediate impairment of performance due to sleep deprivation. (Axes: increase in EMG level due to sleep deprivation, against 5-min. periods of the test.)

Fig. 3. Increase in EMG variability due to sleep deprivation, i.e., Log10 (No Sleep EMG variability / Sleep EMG variability), in Ss with the least, the most, and an intermediate impairment of performance due to sleep deprivation. (Axes: increase in EMG variability due to sleep deprivation, against 5-min. periods of the test.)

The negative correlations which emerged were almost all significant. Increased level of EMG due to loss of sleep correlated negatively with impaired performance in terms of both speed (P = .031), and accuracy (P = .094). Similarly increased variability of EMG under sleep deprivation correlated negatively with reduced speed (P = .061) and reduced accuracy (P = .007). Thus when sleep was lost those Ss whose performance was impaired least were the ones whose EMG was raised most and this holds good whether we correlate speed or accuracy of performance with either level or variability of EMG. To illustrate this (in terms of speed only) Ss have been divided into three groups containing the members of each team showing the least, the most, and an intermediate impairment of performance due to lack of sleep in the first 15 min. of the test. The tendency for these groups to show increased EMG as a result of losing sleep can be seen in terms of level of EMG in Fig. 2 and its variability in Fig. 3. Clearly there is an almost complete separation of the three performance groups in the extent to which their muscle tension rose as a result of working without sleep. These results seem very clear in this particular context, but they should be considered with due regard to the limitations of the experiment. They apply to only one form of activity. Only one physiological measure was taken, the EMG, and this was recorded from only one site. The conclusions which follow immediately and in later discussion should be regarded therefore as topics for confirmatory experiment rather than firm propositions.
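
The permutational assessment of the team correlations described above (six possible rankings in each of four teams, 1296 combinations in all) can be checked by brute-force enumeration. The sketch below is one reading of that procedure: it assumes that "as negative as this or more negative" means that the sum of the four τ values is no greater than the observed sum, a criterion the text does not spell out. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product


def kendall_tau(ranking):
    """Kendall's tau between the natural order 0..n-1 and `ranking` (no ties)."""
    n = len(ranking)
    pairs = n * (n - 1) // 2
    discordant = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if ranking[i] > ranking[j]
    )
    return Fraction(pairs - 2 * discordant, pairs)


def combined_count(observed_taus, team_size=3):
    """Count the combinations of team rankings at least as negative as the observed one."""
    taus = [kendall_tau(p) for p in permutations(range(team_size))]
    observed_sum = sum(observed_taus)
    outcomes = list(product(taus, repeat=len(observed_taus)))
    # Assumed criterion: sum of tau over teams <= observed sum.
    as_extreme = sum(1 for combo in outcomes if sum(combo) <= observed_sum)
    return as_extreme, len(outcomes)


# The worked example from the text: -1.0, -1.0, -0.33, -0.33 in the four teams.
observed = [Fraction(-1), Fraction(-1), Fraction(-1, 3), Fraction(-1, 3)]
count, total = combined_count(observed)
print(count, total)  # 41 1296
```

Exact fractions are used so that the comparison with the observed sum is not disturbed by rounding. Under the assumed criterion the enumeration reproduces the 41 combinations out of 1296 given in the text.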


There are two immediate conclusions. The first is that although some men may be able to forego sleep and perform as well as normally on tasks of the present nature, this performance may be accompanied by abnormally high levels of muscle tension. Secondly, we may recall that Edwards (1941) concluded from incidental observation that work under sleep deprivation is accompanied by abnormally high expenditure of effort. If we can assume that higher and more variable EMG is a sign of such effort the present result may provide more direct experimental evidence for this. A possible corollary is that sleep deprived men may be more uniformly inefficient than has been thought hitherto if we interpret efficiency in a mechanical sense as being the ratio of output to input. Previous implications have often been that where performance is maintained, so also is efficiency. This may only be true if effort remains the same also, and present results suggest that this is not always the case. Where output was maintained, effort or input, as judged by the EMG, was often higher. In such cases sleep deprivation may be reducing efficiency no less than when no extra effort is made and output falls.

Period of KR.—Errors showed no important changes as a result either of adding KR, or of sleep deprivation when this feedback was present. Performance is discussed therefore in terms of speed only.

When KR was given in the last 5 min. there ceased to be any difference between sleep deprived and normal performance (Fig. 1). Previous work (Wilkinson, 1961) has led us to expect this, but in the present experiment the result was brought about in an unusual way. When KR was added under sleep deprivation it raised EMG moderately and improved performance. When it was added after normal sleep, however, it raised EMG much more (P < .01) and this was accompanied by a deterioration in performance. These changes can be seen in Table 1 and Fig. 1. Now Stennett (1957) has shown that, beyond a certain point, increases in EMG may lower performance rather than improve it and it seems reasonable to account in this way for the decline in performance among the sleepers when KR was added. They became overtense. In the circumstances it is not surprising to find that the negative correlation of the first 15 min. between impaired performance and increased EMG under sleep deprivation was reduced almost to zero when KR was added in the last 5 min. The lesson from this is that we should be careful not to generalize too far from results obtained in a relatively unstimulating situation (like the first 15 min. of the present test) to one in which incentives make S anxious to do well. In terms of efficiency as defined above the nonsleepers were no longer at a disadvantage when KR was added; indeed it could be argued that they were more efficient than the sleepers, for they performed as well and their EMG was lower. Clearly research of the present nature must be extended to more stimulating tasks.

Prediction of individual impairment from EMG under normal conditions.—Subjects may be ranked in terms of the ratio of their working level of EMG to that of their preliminary 2-min. period of relaxation, which, indeed, has been the index of level of EMG throughout. Three independent measures of this kind were obtained from each S, the first from the initial practice test and the second and third from the two main tests, one with and one without sleep. These three measures are in considerable agreement in their rankings of Ss, Kendall's coefficient of concordance being .67 and significant (P [sign illegible] .03). This suggests that the extent to which EMG rises in the transition from resting to working varies consistently from person to person. Table 2 summarizes the results of correlating these three assessments of this parameter with impairment of performance due to lack of sleep in the periods of No KR, of KR, and in the two combined, that is the whole test.
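
For readers unfamiliar with the coefficient of concordance used above, the following sketch computes it from several rankings of the same Ss using the standard formula for untied ranks, W = 12S / [m²(n³ − n)]. The rankings shown are hypothetical and the function name is invented; the significance test of W is not covered.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for untied rankings.

    `rankings` holds one list of ranks (1..n) per measure, all over the same n Ss.
    """
    m = len(rankings)          # number of rankings (measures)
    n = len(rankings[0])       # number of Ss ranked
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of five Ss on three measures, not the experimental data.
practice = [1, 2, 3, 4, 5]
with_sleep = [2, 1, 3, 5, 4]
no_sleep = [1, 3, 2, 4, 5]

print(kendalls_w([practice, with_sleep, no_sleep]))
```

W is 1 when all rankings agree exactly and 0 when the rank sums are all equal, so a value of .67 over three measures indicates substantial but not perfect agreement.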

All the correlations involving speed of performance are negative, most of them significantly so. The main test after normal sleep is a reliable predictor, but the data concerning the practice test are the most interesting in that this measure was a truly predictive one, that is its results were quite independent of those to be predicted, namely the effect of lack of sleep on individuals. Unfortunately this value of the practice test as a predictive measure appeared only when all the results were analyzed. It was administered as no more than a practice run and with less care over EMG recording than was exercised in the main tests. But in spite of this it yields values of working-to-resting EMG ratio which predict in advance the impairment of speed of performance under sleep deprivation with fair accuracy and at nearly the .05 level of significance. The rankings also correlate (τ = .44) (P = .023) with those of the highly predictive main test carried out under normal sleep. In short there seems good reason to believe that if a careful preliminary assessment of the ratio of working-to-resting EMG is made this index should predict the degree to which performance will be impaired by lack of sleep in any subsequent performance of the task, the higher the ratio the less the impairment.

TABLE 2

CORRELATIONS (τ) OF THREE MEASURES OF WORKING-TO-RESTING RATIO OF EMG WITH THREE INDICES OF IMPAIRED PERFORMANCE (SPEED) UNDER SLEEP DEPRIVATION

[Table body not recoverable from the source. Surviving column headings: Working-to-Resting Ratio of EMG; No KR; KR; Whole Test (No KR + KR).]

All τ coefficients are negative.


DISCUSSION 


Muscle tension (EMG) is one of a 
number of physiological measures which 
are sometimes (Malmo, 1959) thought to 
reflect the level of arousal of the body as 
defined by Duffy (1957), Lindsley (1951), 
and Hebb (1955). Other possible meas- 
ures include pulse, respiration and meta- 
bolic rates, skin conductance, urinary 
excretion of catechol amines, and alpha 
depression in the EEG. When these are 
recorded under sleep deprivation their 
levels are sometimes higher than normal 
(Freeman, 1932; Hasselman, Schaff, & 
Metz, 1960; Laird & Wheeler, 1926; 
Malmo & Surwillo, 1960; Tyler, Good- 
man, & Rothman, 1947) and sometimes 
lower (Armington & Mitnick, 1959; Ax 
& Luby, 1961; Bjerner, 1949). Similar 
contrasts occurred with EMG in the 
present experiment. The fact that in- 
creased EMG under the stress correlated 
positively with maintained performance 
suggests that this, and perhaps other 
physiological indices, may rise under 
sleep deprivation as the experimental 
situation is stimulating and provokes 
effort. If we examine the conditions 
under which physiological measures were 
taken in previous experiments the im- 
pression is reinforced; where levels in- 
creased the Ss were usually engaged in 
relatively stimulating tasks; where they 
fell the tasks appear less stimulating or 
else the Ss were merely sitting passively. 

If these physiological indices reflect 
the level of arousal we must conclude 
with Malmo and Surwillo (1960) that 
sleep deprivation can either raise or lower 
arousal according to the situation in 
which the S is placed during recording. 
But do they? With No KR in the 
present experiment performance was im- 
paired under sleep deprivation but the 
level of EMG was unchanged (Fig. 1 and 
Table 1); if this implies unchanged 
arousal the relationship between arousal 
and performance is broken. Similarly 
with KR different levels of EMG accom- 
panied the same level of performance 
with and without sleep. Perhaps if we 
wish to retain the inverted U relationship 
between arousal and performance (Hebb, 
1955) we must sacrifice the notion that 
EMG level always reflects the level of 
arousal. In particular when sleep has 
been lost it seems likely that higher levels 
of EMG are required for given levels of 
arousal. This suggests an explanation of 
the abnormally high levels of the so- 
called arousal measures which occurred 
under certain circumstances in the 
present and other experiments: they may 
reflect, not raised arousal, but the effort 
associated with maintaining normal 
arousal and customary standards of per- 
formance in face of the influence of sleep 
deprivation per se which may be always 
towards lowered arousal. 


SUMMARY 


Twelve Ss performed a 20-min. test of 
addition, once after normal sleep and once 
under 32-56 hr. sleep deprivation. Records of 
muscle tension (EMG) were taken from the 
inactive arm. The Ss who maintained per- 
formance best under the stress showed the 
greatest rise in EMG over normal levels. 
Knowledge of results disturbed this relation- 
ship. An independent measure of EMG 
taken under normal conditions predicted 
those Ss whose performance was impaired. 
Sleep deprivation may cause inefficiency even 
in Ss who maintain performance if their 
raised EMG reflects greater effort or energy 
expenditure; this may be the cost of maintain- 
ing normal levels of arousal and performance 
in face of the depressing influence of sleep 
deprivation per se. 


BOUSFIELD, 


AND G. A. WHITMARSH 4 


University of Connecticut 


In recent papers Bousfield, Cohen, 
and Whitmarsh (1958b), and Bous- 
field, Whitmarsh, and Danick (1958) 
have attempted to account for the 
phenomenon of verbal stimulus gen- 
eralization on the basis of the overlap 
of verbal associative responses elicited 
by the given words. Their basic ap- 
proach rests upon the assumption that 
the presentation of a meaningful stim- 
ulus word leads to the elicitation of a 
composite of implicit verbal associa- 
tive responses. They reason that the 
verbal conditioning of a response to a 
stimulus word involves not only the 
conditioning to the stimulus word, 
but also a simultaneous conditioning 
to the composite of verbal associative 
responses to that word. Thus, during 
conditioning trials, members of the 
associative response composite are 
involved in the learning 
through higher-order conditioning. 
The term parasitic reinforcement may 
be used to describe this concurrent 
conditioning of the members of the 
composite of verbal associative re- 
sponses. This term was introduced 
by Morgan and Underwood (1950) to 
explain the following phenomenon. 
After the learning of a given verbal 
response, B, to a stimulus word, A, the 
synonyms of B will have a greater 


process 


1 This paper is based on Technical Report 
No. 32 under Contract Nonr-631 (00) between 
the Office of Naval Research and the Uni- 
versity of Connecticut. Reproduction in 
whole or in part is permitted for any purpose 
of the United States Government. 

2Now at Androscoggin Mental 
Clinic, Lewiston, Maine 

3 Now at the Springfield State 
Sykesville, Maryland 


Health 


Hospital, 


5 


/ 


than chance probability of subse- 
quently being elicited by A. Bous- 
field, Whitmarsh, and Danick (1958) 
extended the concept of parasitic rein- 
forcement to include the aggregate of 
verbal associative responses to a given 
stimulus word. Support for the as- 
sumption underlying the concept of 
parasitic reinforcement was found in 
the fact that the degree to which an 
observable response which had been 
conditioned to one word was elicited 
by the presentation of a second word 
was a function of the verbal associa- 
tive responses common to the two 
stimulus words. Studies by Cohen 
(1958) and by Whitmarsh and Bous- 
field (1961) have replicated these find- 
ings, and have shown them to be 
independent of a specific technique 
used to measure generalization. While 
the theoretical rationale introduced to 
account for the contribution of associative
responses to generalization assumed
the conditioning of the implicit
verbal associates of the first stimulus 
word to the observable response, 
these studies provide only indirect 
support for this assumption. 

The present study was undertaken 
to test the deduction that after paired- 
associate learning the associates of the 
learned response word may also be 
elicited by the stimulus item of the 
learned pair. Specifically we wished 
to test the following hypothesis: the 
paired-associate learning of a mean- 
ingful response word to a nonsense- 
syllable stimulus has the consequence 
of establishing connections between 
the nonsense syllable and the members 
of a group of verbal associative responses to the learned response word.
For example, if the word RAYON were learned as a response to the nonsense
syllable GOX, we should expect to find evidence of acquired connections
between GOX and the associative responses to RAYON, as for example,
Silk, Nylon, Material, and Soft. The relative strengths of the members of
the composite of verbal associative responses to a given word may be
measured from their cultural frequencies of occurrence as responses to
that word in free associational norms of the Minnesota type (Russell &
Jenkins, 1954). Our second experimental hypothesis concerns the relation
between cultural habit strengths of the associates of a given word and
their susceptibility to conditioning: the strength of the connections
established between the nonsense syllable and the associates of the learned
response word is an increasing function of the cultural habit strengths
of the associates as responses to the learned response word.

VERBAL ASSOCIATIVE RESPONSES

TABLE 1

MEANINGFUL RESPONSE WORDS USED IN TRAINING AND THEIR TESTED
ASSOCIATES: EXP. I

Words Used as            Associates of the Learned Responses and
Responses in Training    Their Cultural Frequencies of Occurrence
                         High        Medium      Low
ANIMAL                   Dog         Cat         Man
ICE                      Cold        Water       Cream
LETTUCE                  Tomato      Green       Leaf
MOSQUITO                 Bite        Bug         Insect
PETAL                    Flower      Rose        Leaf
RAYON                    Silk        Nylon       Material
TABLE                    Chair       Write       Office
TIN                      Can         Metal       Roof
TYPHOID                  Fever       Disease     Sickness
WAGON                    Wheels      Train       Red

Mean                     62.1        18.7        8.4

Note See text for definition o


EXPERIMENT I

Method

The materials for the initial learning were 10 pairs of nonsense syllables
and meaningful words. The following 10 nonsense syllables were selected
from the Glaze (1928) list on the basis of their having association values
ranging from 0 to 47%: GOX, HAJ, MUP, NID, QOL, RUC, SIW, VEK, YEF, and
ZAB. The 10 meaningful words, which are listed in Table 1,
were selected from a list of 150 words for which free-associational norms
had been compiled from a population of 150 Ss. Three different randomized
pairings of these items were then prepared so that no nonsense syllable
was paired with the same word more than once. The items for the testing
phase of the experiment were free associational responses to the 10
response members of the initial learning pairs selected on the following
basis. The 150 associational responses to each of the 10 learned words
were divided into tertiles on the basis of their cultural frequencies of
occurrence in the normative data. The associates in each of these three
groups then comprised the pools of low-, medium-, and high-frequency
associates. In choosing associates for the testing phase of the experiment
the restriction was imposed that a chosen associate to a given word should
not appear in the gradient of associational responses to any of the other
9 learned words. Within this restriction, the associate having the highest
frequency in each frequency group was chosen as one of the three test
words for a given learned word. The three associates thus chosen for each
of the 10 learned words are listed in Table 1 along with their
corresponding cultural frequencies of occurrence as associates to the
learned word. It may be noted that Leaf appears as a low-frequency
associate to both LETTUCE and PETAL. This violation of the restriction
imposed in the selection of associates was necessitated by the relative
lack of degrees of freedom imposed by the limited pool of 150 stimulus
words.

The Ss were 140 undergraduate students 
who were trained and tested in three groups 
comprising 51, 44, and 45 Ss, respectively. 
Each group received a different randomization 
of the paired-associate lists for learning. The 
items were presented one at a time to Ss by 
means of a Selectroslide projector set for 
exposures of 2.5 sec. The instructions and 
procedure employed for the paired-associate 
learning were the same as those devised by 
Cohen (1958) for group-method experiments. 
Eight learning trials were administered to all 
Ss. On alternate trials Ss were asked to 
anticipate and write in a booklet provided 
for this purpose the response member of the 
pair when shown the nonsense syllable. They 
were then verbally presented with the correct 
word. A total of 27 Ss failed to reach the 
criterion of all correct anticipations on the last 
trial and were therefore dropped. This re- 
sulted in Ns of 43, 33, and 37 Ss, respectively, 
for the three experimental groups. 

After the initial learning, E proceeded 
immediately to the testing phase. A pilot 
study had demonstrated that the procedure 
of simply presenting the nonsense syllable with 
instructions for free association was effective 
in eliciting associates to the learned words in 
only 40% of the cases. This consideration led 
to the development of an alternative pro- 
cedure. The S was given 10 data sheets, in 
booklet form, one sheet for each of the non- 
sense syllables of the training phase of the 
experiment. The nonsense syllable was fol- 
lowed by five words, one of which was one of 
the three chosen associates of the learned 
word. For example, for Ss who learned the 
pair GOX-RAYON, one of the five words was 
Material, a low-frequency associate of RAYON. 
The remaining four control words were 
selected at random from a dictionary with the 
restriction that they should not appear as an 
associate of any one of the 10 words used in 
the learning. The S was instructed to check 
the one word which he felt to be “most related
to the nonsense syllable.”⁴ The order of the
syllables in the booklet was randomized between
Ss as was the position of the critical
associate among the four control words.
Each group of Ss received booklets containing
high-, medium-, and low-frequency associates
distributed among the 10 nonsense syllables
and among the three groups of Ss in a counterbalanced
design. This design required the
use of 120 control words.

⁴ In a subsequent study these instructions
were rephrased so that Ss were asked to
check the one word which the stimulus item
“most makes you think of.” There was no
evidence to indicate the choices of Ss were
altered by this change in the instructions.

With the forced-choice test instructions it 
might be supposed that the probability of 
selecting any one of the five choices would 
be .2. Such an assumption, however, ap- 
peared unsafe in view of the possibility that an 
alternative might be selected on the basis of 
extraneous factors such as phonetographic 
similarity to the nonsense syllable. It ap- 
peared advisable, therefore, to obtain what 
may be called base frequency data from a 
group of control Ss. For this purpose three 
control groups of undergraduates comprising 
51, 33, and 34 Ss, respectively, were presented 
the test booklets and the same instructions as 
were given to the three groups of experimental 
Ss. Thus, each of the 30 forced-choice 
association tests, i.e., 10 for the low-, 10 for 
the medium-, and 10 for the high-frequency 
responses, was taken by 51, 33, or 34 Ss. 
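
The reason for collecting base frequency data, rather than assuming a chance rate of .2, can be illustrated with two control proportions reported later in this article. The following is a minimal Python sketch, not a procedure the authors describe: the exact one-sided binomial test against .2 is our assumption, the count of 17 of 51 control Ss choosing Insect is stated in the text, and the count of 10 of 51 for Dog is inferred from the reported proportion of .196.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


CHANCE = 0.2  # nominal guessing rate with five alternatives

# Insect (tested with MUP): 17 of 51 control Ss, as stated in the text.
p_insect = binomial_upper_tail(17, 51, CHANCE)

# Dog: 10 of 51 control Ss, a count inferred from the reported .196.
p_dog = binomial_upper_tail(10, 51, CHANCE)

print(f"Insect: base rate {17 / 51:.3f}, P(X >= 17 | p = .2) = {p_insect:.4f}")
print(f"Dog:    base rate {10 / 51:.3f}, P(X >= 10 | p = .2) = {p_dog:.4f}")
```

Under these assumptions the control rate for Insect lies noticeably above .2 while the rate for Dog sits close to it, which is the kind of item-to-item variation the control groups were intended to absorb.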


Results 


The first step in the treatment of the data was that of tabulating the
total frequency with which each of the 30 associates used in the testing
was selected as most related to its associated syllable. For example, the
pair MUP-MOSQUITO appeared in the initial learning. In the testing
situation the group of 43 Ss who had received this pair for learning was
presented with MUP and asked to select the word most related to MUP from
five alternatives comprising Insect, the low-frequency associate of
MOSQUITO, and the control words Knife, Field, Crazy, and Word. In view of
the predicted facilitation of the associative responses to MOSQUITO, the
checking of Insect as the preferred alternative was for convenience
labeled “correct.” Control word choices were designated as “incorrect.”
In these terms all 43 of the experimental Ss who were presented with this
set of choices gave “correct” responses, whereas 17 of the 51 control Ss
who received the same test gave the so-called “correct” responses and 34
gave responses labeled “incorrect.” A chi square analysis of these data
with a two-tailed test indicates that the difference between experimental
and control group responses is significant beyond the .01 level in the
direction predicted by the experimental hypothesis. A similar treatment
of the experimental and control group data for the remaining 29
associates indicated that all differences were significant beyond the .01
level in the predicted direction.

TABLE 2

PROPORTION OF EXPERIMENTAL (E) AND CONTROL (C) Ss SELECTING EACH ASSOCIATE
AND THE DIFFERENCES (E−C) BETWEEN THESE PROPORTIONS: EXP. I

Word         High E−C    Medium E
ANIMAL         .757        .973
ICE            .843        .784
LETTUCE        .653        .970
MOSQUITO       .659        .939
PETAL          .659        .838
RAYON          .747        .970
TABLE          .662        .424
TIN            .671        .953
TYPHOID        .697       1.000
WAGON          .636        .953

Mean           .698

Note. All E−C values are significant in the predicted direction at less
than the .01 level.

The next step taken in the analysis of the data was that of determining
the nature of the relationship between the cultural frequencies of the
associates as represented in the three groups of high, medium, and low on
the one hand, and the extent to which these associates were facilitated
in the testing phase of the experiment. The mean cultural frequencies of
these associates, based on the normative population of 150 Ss, were,
respectively, 62.1, 18.7, and 8.4. The prediction was that the number of
responses labeled correct by the experimental Ss should be greatest for
the high-frequency associates and least for those of low frequencies. The
following steps were taken in this analysis. First, the proportion of Ss
who gave the so-called correct responses was determined for each of the
30 associates listed earlier in Table 1. These proportions appear in
Table 2, and are listed in Column E for the experimental Ss and in Column
C for the control Ss who supplied the base frequency data. Thus, the
high-, medium-, and low-frequency associates of ANIMAL were, respectively,
Dog, Cat, and Man. Table 2, Column E, shows that the proportion of
experimental Ss who checked Dog as related to the nonsense syllable which
had been paired previously with ANIMAL was .953. The proportion of
control Ss, Column C, who selected Dog was .196. The difference between
these proportions, .757, is listed in Column E−C. This difference may be
said to represent the effect of learning. As indicated earlier, this
difference is significant. The means of these adjusted proportions for
the high, medium, and low associates are, respectively, .698, .683, and
.680. Three CR tests of the differences between the proportions were
performed for these three adjusted means. The differences between the
means of the adjusted proportions for the high vs. medium, medium vs.
low, and high vs. low groups are .015, .003, and .018, respectively.
These mean differences do not differ significantly. Thus, while the
findings of Exp. I support the first hypothesis, the variation in
strength of the associates of the response words used in the initial
training did not prove to be a significant parameter as predicted in the
second hypothesis.
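
The chi square comparison described for the MUP-MOSQUITO item, and the E−C adjustment described for Dog, can be reproduced from the counts in the text. The following is a minimal Python sketch under stated assumptions: the Pearson chi square without a continuity correction is our choice (the article does not say which form was used), and the counts 41 of 43 and 10 of 51 for Dog are inferred from the reported proportions .953 and .196.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Insect as associate of MOSQUITO (counts stated in the text):
# experimental Ss: 43 "correct", 0 "incorrect"; control Ss: 17 "correct", 34 "incorrect".
chi2 = chi_square_2x2(43, 0, 17, 34)
CRITICAL_01 = 6.635  # tabled chi square value for 1 df at the .01 level
print(f"chi square = {chi2:.2f}, beyond .01 level: {chi2 > CRITICAL_01}")

# Dog as associate of ANIMAL (counts inferred from .953 and .196).
e_minus_c = 41 / 43 - 10 / 51
print(f"E - C for Dog = {e_minus_c:.3f}")
```

With these inputs the statistic is well beyond the .01 criterion, and the adjusted proportion for Dog agrees with the .757 given in the text.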


EXPERIMENT II


In light of the unexpectedly strong 
effects of the so-called low-frequency 
associates in Exp. I, Exp. II was 
undertaken to extend the range of 
cultural frequencies tested to associ- 
ates occurring only once in the norma- 
tive population of 150 Ss. 


The same nonsense-syllable meaningful-word pairs used in Exp. I were
learned by the 136 undergraduate Ss participating in Exp. II. The Ss were
trained and tested in three groups of 43, 49, and 44 Ss, respectively.
The critical associates tested in Exp. II were divided into three groups:
one of 10 associates with cultural frequencies of either 3 or 4 which is
designated the Low group, and two groups of 10 associates, each associate
having a cultural frequency of 1. These groups were designated,
respectively, Low-Low A and Low-Low B. The associates having the
frequency of 1 were selected at random from the normative data with the
restriction that they did not appear among the free associates to the
other 19 words in the two Low-Low lists. The same forced-choice test
procedure, counterbalanced experimental design, and instructions used in
Exp. I were employed. The items used in Exp. II are presented in Table 3.
A total of 27 Ss served as controls and provided the base frequency
normative data used in this experiment. An analysis of the normative data
collected in Exp. I indicated that adequately stable data could be
obtained with an N of this size.

TABLE 3

MEANINGFUL RESPONSE WORDS USED IN TRAINING AND THEIR TESTED
ASSOCIATES: EXP. II

Words Used as            Associates of the Learned Responses and
Responses in Training    Their Cultural Frequencies of Occurrence
                         Low         Low-Low A   Low-Low B
ANIMAL                   Bear        Human       Ugly*
ICE                      Berg        Winter      Hard
LETTUCE                  Money       Chow        Potato
MOSQUITO                 Gnat        Nasty       Pest
PETAL                    Push        Brake*      Fall
RAYON                    Soft        Yarn        Skirt
TABLE                    Paper       Brown       Book
TIN                      Copper      Pail        Rubber*
TYPHOID                  Illness     Neck        Gear*
WAGON                    Red         Drunk*      Children

Mean                                 1           1

* Indicates the five associates which experimental Ss did not select with
frequencies significantly different from the choices indicated in the
base frequency data.


Results 


In Table 4, Column E shows the proportion of experimental Ss who gave the
so-called correct responses for each of the 30 associates used in this
experiment. Column C lists the proportions of “correct” responses for the
control Ss, and Column E−C shows the differences between the experimental
and control proportions.

TABLE 4

PROPORTION OF EXPERIMENTAL (E) AND CONTROL (C) Ss SELECTING EACH ASSOCIATE
AND THE DIFFERENCES (E−C) BETWEEN THESE PROPORTIONS: EXP. II

Word         Low E       Low E−C
ANIMAL        .909        .835**
ICE           .864        .568**
LETTUCE       .721        .610**
MOSQUITO      .861        .187**
PETAL         .341        .156*
RAYON         .861
TABLE         .302        .191**
TIN          1.000        .630**
TYPHOID       .837        .566**
WAGON         .673        .488**

Mean                      .541

a Mean E−C of Low-Low A and Low-Low B combined.
* .01 < P < .05
** P < .01

Individual chi square tests were performed on the differences between the
“correct” choices of the experimental Ss and the control Ss for each of
the 30 associates. This analysis indicated that with one exception all
associates in the low-frequency group were chosen by Ss at significance
values beyond the .01 level in the direction predicted by the
experimental hypothesis. The associate of PETAL, namely, Push, was
significant at the .05 level. Thus, the results strongly confirm the
first experimental hypothesis even when the cultural frequencies of the
associates tested are further reduced in magnitude as compared to the
low-frequency associates of Exp. I. Similarly, chi square tests were made
on the 20 Low-Low associates having cultural frequencies of occurrence of
1. This analysis indicated that 15 of these associates were selected by
the experimental Ss at or beyond the .05 level of significance when
compared with the base frequency data by means of two-tailed chi square
tests. The five associates which did not attain significance are
indicated in Table 3. Thus, connections were established between the
nonsense syllables and 75% of the Low-Low associates of the learned
response words even when the cultural frequencies of these associates as
responses to the learned words were so low as to occur only once in a
normative group of 150 Ss.

Several comparisons were made between the data provided by the two
experiments. The means of the E−C proportions of the two Low-Low groups
used in Exp. II were combined after a CR test indicated that the
difference between these two groups was not significant. Two-tailed CR
tests for uncorrelated proportions were made on the differences between
all frequency groups in Exp. I and those in Exp. II and between the
low-frequency group and the combined Low-Low groups of Exp. II. No
significant differences between any of these adjusted mean proportions
were obtained. Although there is a trend of decreasing mean proportions
of correct responses for the frequency groups used in both experiments,
the statistical analyses indicated that none of the means involved in
this trend showed significant differences between each other. Even the
difference between the data for the High group of Exp. I and the combined
Low-Low groups of Exp. II was not significant. The second hypothesis was
not supported.
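
The CR (critical ratio) test for uncorrelated proportions is named but not spelled out in the text. The following is a minimal Python sketch of one common form, the pooled-variance normal deviate for two independent proportions; both the formula and the N entering it are assumptions here, since the article states neither. For illustration it is applied to the Exp. I High and Low adjusted means (.698 and .680, as given in the text), with a hypothetical n of 10 per group corresponding to the 10 associates in each frequency group.

```python
import math


def critical_ratio(p1, n1, p2, n2):
    """Pooled-variance normal deviate for two uncorrelated proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def two_tailed_p(z):
    """Two-tailed probability of a standard normal deviate at least |z|."""
    return math.erfc(abs(z) / math.sqrt(2))


N_PER_GROUP = 10  # assumption: one observation per associate in a frequency group

# High vs. Low adjusted mean proportions from Exp. I (values given in the text).
z = critical_ratio(0.698, N_PER_GROUP, 0.680, N_PER_GROUP)
p = two_tailed_p(z)
print(f"CR = {z:.3f}, two-tailed P = {p:.3f}")
```

Under this assumed N the critical ratio falls far short of the 1.96 needed for the .05 level, in line with the reported absence of a frequency effect. A larger assumed N would raise the ratio, so the sketch should be read as an illustration of the computation, not a reconstruction of the authors' values.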


DISCUSSION 


The findings support the assumption that the learning of a meaningful
verbal response to a nonsense syllable stimulus results in the
establishment of measurable associative relationships between the
stimulus and the verbal associative responses to the meaningful word. Two
alternative explanations of this phenomenon may be considered. The effect
may be a consequence of mediation in the testing phase of the experiment
provided by recall of the learned response to the nonsense syllable. The
S may recall that he has learned, for example, RAYON as the response to
GOX. The associates of RAYON, namely, Soft, Material, etc., are then
mediated by the recall of RAYON, and S proceeds to check the associate
Soft as the response most related to GOX. On the other hand the
theoretical approach of Bousfield and his associates suggests that the
phenomenon is attributable to the higher-order conditioning of the
implicit verbal associates of the learned response word during the
training phase of the experiment. A test of the assumption of the
training phase locus of the effect would require the demonstration of
parasitic reinforcement of the associative responses when the learned
response had been forgotten and was no longer available to S. Failure to
demonstrate the phenomenon under this condition, however, would not
necessarily indicate that the locus was in the testing phase since it may
very well be that the time interval necessary for the forgetting of the
originally learned response word is also sufficient for the forgetting of
the associational responses. While the locus of the effect has not been
tested directly, some support for a training phase locus may be found in
a study by Yavuz and Bousfield (1959) who showed that the connotative
meaning of a foreign word could be recalled after the supposed English
translation of the word had been forgotten. They suggested that the
conditioning of the associational responses of the English word in the
training phase mediated the meaning judgments of Ss.

D. KINCAID, JR., W. A. BOUSFIELD, AND G. A. WHITMARSH

It would seem that the failure to find a differential effect as a
function of the habit strengths of the associative responses may be
attributed to either of two factors or to a combination of these factors.
In the first place, it is evident that the findings here reported derive
in part from the use of a particular method for appraising the presence
of the associative connections assumed to have been established in the
initial learning. In each of a series of tests, the Ss were given one of
the nonsense syllables encountered in the initial learning which was
followed by five different words. The Ss were told to choose the one word
of these five which they judged to be most related to the given nonsense
syllable. In each case one of the five alternatives was an associate of
the word learned as a response to the nonsense syllable. According to the
theory outlined by Bousfield et al. (1958), this associate should have
been elicited implicitly in contiguity with the presentation of the
nonsense syllable during learning. It may therefore be said that the
testing method actually employs the method of recognition. This method
typically yields relatively high scores in tests of retention as long as
the learned items are embedded in dissimilar new items as was the case in
the present study (Luh, 1922). The sensitivity of the method of
recognition is most likely due to the opportunity it provides S for
making use of relatively weak associations.

An alternative explanation of the 
failure of the findings to discriminate 
between the strengths of the associative 
habits is possible. It is conceivable that 
associative response strengths repre- 
sented by a cultural frequency of 1 in a 
population of 150 are of sufficient potency 
in certain situations to become as effect- 
ive as the associative responses whose 
strengths are reflected in higher fre- 
quencies of occurrence. If this is so, it 
would suggest that more attention needs 
to be paid to the so-called weak associa- 
tive habits in the study of verbal be- 
havior. Perhaps these habits are not as 
weak in effect as might be supposed from 
their cultural frequencies of occurrence. 
A similar phenomenon has been found 
in several studies employing Thorndike- 
Lorge frequency of usage values in which 
differences in performance as a function 
of high- or low-frequency values, while 
significant, are small in absolute differ- 
ences (Bousfield, Cohen, & Whitmarsh, 


1958a; Hall, 1954). In discussing this, 

REVERSAL AND NONREVERSAL SHIFTS WITHIN AND BETWEEN
DIMENSIONS IN CONCEPT FORMATION

I. DAVID ISAACS AND CARL P. DUNCAN

Northwestern University


In a number of studies of concept 
learning in human adults (Buss, 1953, 
1956; Harrow & Friedman, 1958; 
Kendler & D’Amato, 1955; Kendler & 
Mayzner, 1956), Ss were first rein- 
forced for different responses to two 
stimuli varying on some dimension 
(e.g., circle vs. square, form dimen- 
sion) while the stimuli simultaneously 
varied on one or more momentarily 
irrelevant dimensions (e.g., color). 
After mastery of this task (hereafter, 
the training task), some Ss were 
shifted to a transfer task in which each 
of the two stimuli that had been 
reinforced in training was now paired 
with the opposite response (reversal 
shift). Thus, in transfer, reversal Ss 
had to learn two re-paired S-R 
associations. Other Ss were shifted to 
a transfer task provided by reinforcing 
the stimuli (e.g., red vs. blue) on a 
previously irrelevant dimension (non- 
reversal shift to a different dimension). 
All of the studies using human adults 
as Ss (those cited above, the only ones 
of concern here) consistently found 
that nonreversal shift to a different 
dimension provided a more difficult 
transfer task, in terms of trials to 
learn, than reversal shift. 


This finding has been used to sup- 
port a mediation theory of the way 
human adults learn and transfer in 
such concept tasks (for details, see, 
e.g., Goss, 1961; Kendler & D’Amato, 


1955). However, a mediation theory 
also predicts, according to Kendler 
and D’Amato, that the reversal 
condition should yield positive, not 
negative, transfer in comparison to a 


control group that learns only the 
transfer task. Since it is usually found 
that re-pairing of S-R associations 
produces negative transfer (e.g., 
Porter & Duncan, 1953), it is im- 
portant to determine if this prediction 
can be confirmed. Of the three studies 
that used a control group, one (Kend- 
ler & D'Amato, 1955) did find that 
the reversal group learned the transfer 
task more quickly than the control 
group; one (Buss, 1953) found the 
control learned faster than the reversal 
group; and one (Harrow & Friedman, 
1958) found no difference. This dis- 
agreement among the studies is prob- 
ably unimportant because, it is sug- 
gested here, none of the studies 
actually used an appropriate control 
group. In all cases the control group 
learned only the transfer task; no 
attempt was made to equate control 
and experimental groups on non- 
specific transfer variables (e.g., learn- 
ing to learn, warm up) which would 
be developed in the experimental 
groups by the training task. Since 
nonspecific transfer factors are likely 
to have a net positive transfer effect, 
performance of the control groups in 
the three cited studies was probably 
poorer than would have been the case 
if nonspecific transfer had been con- 
trolled. So, the present study is a 
further comparison of reversal shift 
(R) and nonreversal shift to a 
different dimension (NRD) in trans- 
fer, along with an attempt to provide 
a more appropriate transfer control 
for these groups. 

In addition to the usual NRD condition, it is also possible, as Harrow
and Friedman (1958) point out, to provide another kind of nonreversal
shift in transfer, viz., nonreversal shift on the same dimension that was
relevant in training (NRS). Harrow and Friedman suggest that this NRS
condition should also, like the R condition, be easier to learn, in
transfer, than NRD. This prediction is also tested in the present study.


METHOD 


Apparatus.—The S and E, seated on 
opposite sides of a table, were separated by a 
vertical plywood panel 29 in. high, 48 in. 
wide. The side of the panel viewed by S was 
painted gray and contained a plastic window 
24 in. high, 43 in. wide, centered in the panel 
11 in. above the table. Two lights, one on
each side of the window, were used to provide
reinforcement. Two push buttons were fixed 
to the table, one below each light. If S 
pushed either button, the light above it came 
on to signal a correct choice, provided that E 
had previously set a mercury switch on E’s 
side of the table. 

On E’s side of the panel a deck of stimulus 
cards was pressed against the window by 
means of a drawbar and springs. Thus, when 
E removed the card appearing in the window, 
the next card was immediately revealed. 

Stimuli.—For Ss in experimental groups the stimuli varied on two
dimensions, form and number (of forms), one or the other of which was
relevant at some time during the experiment for all Ss. In addition, the
stimuli varied in color (all forms on any one stimulus card were either
red or blue), a dimension that was always irrelevant. All stimuli were
drawn with colored pencils on white 3 X 5 in. cards. Cards were inserted
in plastic envelopes.

There were four values on the form dimension (circle, square, hexagon,
triangle), and four on the number dimension (one, two, three, or four
forms on a card). At any one time during the experiment S had to respond
to just two of the values, on one of the dimensions, paired against each
other, e.g., circle vs. square. Only the following pairs were used:
circle vs. square, hexagon vs. triangle, one vs. three forms, two vs.
four forms.



The training stimuli for the control group 
were vertical arrows, colored black, drawn on 
3 X 5 in. cards. These control stimuli also 
varied on two dimensions, each relevant for 


some Ss: direction (up-pointing or down 


TABLE 1

STIMULI AND EXPERIMENTAL DESIGN

                                            Training
Group                                       Left      Right     Transfer
R (Reversal to same dimension)              C1, C3    S1, S3
                                            C2, C4    S2, S4
                                            H1, H3    T1, T3
                                            H2, H4    T2, T4
NRD (Nonreversal to different dimension)    1S, 1C    3S, 3C    Same as for Group R
                                            2S, 2C    4S, 4C
                                            1H, 1T    3H, 3T
                                            2H, 2T    4H, 4T
NRS (Nonreversal to same dimension)         H2, H4    T2, T4    Same as for Group R
                                            H1, H3    T1, T3
                                            C2, C4
                                            C1, C3
Control                                     UX, UZ              Same as for Group R
                                            XU, XD
                                            UX, UZ
                                            XU, XD

Note. C, S, H, T = circle, square, hexagon, triangle; 1, 2, 3, 4 = number
dimension, number of forms on a card. U or D = up-pointing or
down-pointing arrow; X = short arrow, Z = tall arrow. Left and right
indicate responses. Symbols in bold face print indicate reinforced
stimuli or dimension.
pointing arrowhead), é i 1 I inappropriate control group 1ot corrects 
here was also a dimensio lat was for nonspecific transfer) would perform on tl 
ways irrelevant, width: an arrow is t transfer task 
bin. or $ in. wide rhere was only Subjects rhe Ss were stude 
on ¢ ich « ird psychology courses l 
Condition Che design i W ibl ips in turn. Each of the three 
Chere were three experimental gt ental groups (R, NRD, NRS) was 
ontrol group, all given differs training 32 Ss. Two Ss in Group NRD failed 
‘ r task \ i cri on on the training task 
fable 1, all l I t iced 
by putting together a pair of st | It soon became clear that for 
m one dimension and a pair from tl ther , discrimination of height of arrows 
mension were used, for different I much more difficult than discriminatior 


group, in both Linh id tr I rherefore Ss were 1 


discriminat ' ther circle \ squat 
hexagon vs. tria Group NRD 
trained Lito 
one Vs 
NRS w 
but w 
Group k 
reversed 
either ‘ ct { Vn-pointin r Left we 
irrow | 
All grou 
discrimin " 
lable 1 also shows tl for 1 cards 


possible « 


partial rein em I I 1 
was controlled. Whe I ! \ h wr control S Two 
| 


to the transfer task, the previ nfe so 16 cards i 
values on the number di mn) Wwe ( i different ord 
new valu would receiv irtial $ perm 


direction ol arrow 
is the first card shoy 
the second ( 


ther value 


id direc to 
j 


, | 1 j 
on one dimension | oy} \ 1es st card 


on the other dimension, it s decid t ( re ited S saw all po 
this same “deg ) ing ( il re values and dime 10 
experime al group | s¢ t rt lar task 1 
Pabl 1 l different ra 1dom 
Chere is one m import f f th t tl rder of pres 
design shown in 7] be seen t r ning 14 « ls, with the 
in both Group | in u NRS. 8 different stimulus cards | 
particular , iscriminati ; ble iny card w 
The i 
} require to press the left or right 
of some oO I 1 ranstet [Therefore ppearing in > window, 
when . Lining isk t | ou g correct, the 
(;rou NR ~ 1 ) } come on 


nterbs lor both training and tr 
each of these l t ks pre | ea required to reach a criterio 


of difficulty I I correct responses. If S had 


of nonspecific trans fror ior three presentatio 
other words, performan roups R d six presentations 


NRS during training i n sure Oo va timulus cards), S w is dropper
The S was allowed to proceed at his own pace; on the
average, about 7 sec. elapsed between presentation of
successive cards. There was no interruption between
training and transfer tasks.


RESULTS


Training.—The left portion of Table 2
summarizes performance on the
training task as measured by
number of trials to the criterion of six
successive correct responses. The
six criterion trials are not included in
the data. Although 32 Ss were taken
to criterion on the training task in
each of the Groups R, NRD, and NRS, 1 S in
Group NRS and 1 S in NRD failed to
reach criterion on the transfer task.
These 2 Ss were eliminated, and 1 S
in Group R, with median performance
in training, was also eliminated to
reduce N to 31 in each of the three
experimental groups.

There was no significant difference
among the three experimental groups
(top three lines in Table 2) on the
training task (F < 1). Hartley's test
indicated that the variances
were homogeneous (F max = 1.87,
df = 3,30).

It was noted earlier that it was
necessary to run separate subgroups
in Group C because of differential
difficulty of the dimensions of training
stimuli. The difference between the
means (see Table 2) of the subgroup
that discriminated direction (Group
Cd) and the subgroup that discriminated
height (Group Ch) was highly
significant (t = 4.06).

When Group Ch was included with
the three experimental groups in
analysis of variance of training means,
F is 3.22 (P < .05, df = 3/113).
By t test, the mean for Group Ch
differed significantly from the means
of each of the three experimental
groups at the 5% level or less.
Analysis of variance of Group Cd and
the three experimental groups yielded F < 1.
Hereafter, Group Cd will be considered
the more appropriate control group.


TABLE 2

MEAN TRIALS TO CRITERION
IN TRAINING

Transfer.—Mean trials to criterion
on the transfer task are shown in
Table 2. Again, the means do not
include the six criterion trials. Since
the variances of the three experimental
groups were heterogeneous
(F max = 5.03, P < .01), and since the
distributions were also positively
skewed, the scores were transformed
to log (X + 1). This eliminated
heterogeneity of variance and produced
approximately normal distributions.
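
The variance check and the transformation just described can be sketched in a few lines. This is only an illustration under stated assumptions: the scores below are invented (they are not the study's data), the helper names `hartley_fmax` and `log_transform` are hypothetical, and base 10 is assumed for the logarithm because the text does not state a base.

```python
import math
from statistics import variance


def hartley_fmax(groups):
    """Return Hartley's F max: largest sample variance divided by the smallest."""
    variances = [variance(scores) for scores in groups]
    return max(variances) / min(variances)


def log_transform(scores):
    """Apply the log (X + 1) transformation (base 10 assumed) to each score."""
    return [math.log10(x + 1) for x in scores]


# Hypothetical, positively skewed trials-to-criterion scores (NOT the study's data).
raw_groups = [
    [0, 1, 2, 3, 4, 6, 9, 15],
    [1, 3, 5, 8, 12, 20, 35, 60],
    [0, 0, 1, 1, 2, 2, 3, 5],
]
log_groups = [log_transform(g) for g in raw_groups]

fmax_raw = hartley_fmax(raw_groups)
fmax_log = hartley_fmax(log_groups)
print(f"F max before transformation: {fmax_raw:.2f}")
print(f"F max after log (X + 1):     {fmax_log:.2f}")
```

Adding 1 before taking the logarithm keeps a score of zero trials defined, and compressing the long upper tail is what pulls the group variances closer together in this made-up example.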
Analysis of variance of the
transformed scores of the experimental
groups gave F = 19.4 (P < .001,
df = 2/90). By t test, Group R
differed significantly from Group
NRD (t = 3.14), and from Group
NRS (t = 3.32). Groups NRD and
NRS also differed significantly
(t = 6.46). The fact that Cond. R
was easier than Cond. NRD is in
agreement with all previous studies
that have made this comparison. The
new finding is that Cond. NRS was
easiest of all.

Analysis of variance of transformed
scores of experimental groups and
Group Cd yielded F = 16.6 (P < .001,
df = 3/113). By t test, Group Cd
differed significantly from Group R
(t = 2.42) and from Group NRD
(t = 5.35) but not from Group NRS
(t < 1).
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TABLE 3 
MEAN ERROR RATIOS 
Training Transfer 
Group 


Mean Mean om 


2 ‘ 70 .020 
RX f A9 007 
A. ‘ 42 .029 
50 d 44 .022 
50 .66 .020 


Errors.—The number of trials on 
which S pressed the wrong button 
(errors), divided by the number of 
trials to criterion, was computed for 
each S. These error ratios for both 
training and transfer are summarized 
in Table 3. There were no significant 
differences among groups in training. 
Analysis of variance of the transfer 
data for experimental groups and 
Group Cd yielded F = 4.80 (P < .01, 
df = 3/113). By t test, Group R
differed significantly from Group
NRD (t = 2.53), from Group NRS
(t = 3.39), and from Group Cd
(t = 2.92). Other comparisons were
not significant. 
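
The error ratio defined above is a per-subject quotient that is then averaged within a group. A minimal sketch follows, assuming hypothetical records of (errors, trials to criterion); the function names are illustrative, and treating a subject with zero trials as having a ratio of zero is an assumption the text does not address.

```python
from statistics import mean


def error_ratio(errors, trials_to_criterion):
    """Wrong-button trials divided by trials to criterion for one S."""
    if trials_to_criterion == 0:
        # Assumption: an S with no scored trials made no errors.
        return 0.0
    return errors / trials_to_criterion


def group_mean_error_ratio(records):
    """Mean of the per-S error ratios for a group of (errors, trials) records."""
    return mean(error_ratio(e, t) for e, t in records)


# Hypothetical records (errors, trials to criterion); not the study's data.
example_group = [(3, 10), (1, 4), (0, 0), (5, 10)]
print(round(group_mean_error_ratio(example_group), 4))
```

Computing the ratio for each S before averaging means that every S contributes equally to the group mean, regardless of how many trials that S needed.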


DISCUSSION 


The data show that Group R, operat- 
ing under a negative transfer paradigm, 
did in fact show significant negative 
transfer when compared to a control 
group in which nonspecific transfer was 
controlled. Group R also showed the 
highest error ratio in transfer, another 
index of intertask interference. 

The need to control for nonspecific 
transfer in studies of this kind is indicated 
by the powerful effects such transfer had 
in the present study. Recall that for a 
group of Ss as a whole, the training task 
for both Groups R and NRS was iden- 
tical to the transfer task for all groups; 
therefore, performance of these groups 
in training yields a measure of difficulty 
of the transfer task for Ss not provided 
with training for nonspecific transfer. 
This measure was essentially the same
for both Groups R and NRS (7.03 and
7.00 mean trials to criterion in training,
respectively). Nonspecific transfer was 
presumably controlled in Group Cd, 
and this group required a mean of only 
2.67 trials (transfer mean) to learn the 
same task. It seems likely that in the 
studies of Buss (1953), Harrow and 
Friedman (1958), and Kendler and 
D’Amato (1955), the reversal groups 
would have shown negative transfer had 
the control groups been trained so as to 
minimize nonspecific transfer. 

Most of the data on reversal shifts in 
concept learning in human adults has 
been interpreted in terms of ‘‘mediating 
mechanisms” or “implicit cues’’ (Goss, 
1961; Harrow & Friedman, 1958; Kend- 
ler & D’Amato, 1955). The interpreta- 
tion of the present data, which follows, 
avoids this particular theoretical lan- 
guage. Instead, the interpretation is 
based largely on a single, and presumably 
fairly simple, assumption. 

Assume that Ss reinforced on a par- 
ticular dimension and extinguished on all 
other dimensions during training, tend 
to continue to respond, initially, to the 
reinforced training dimension during 
transfer. If so, then Group NRS 
(trained on forms, transferred to new 
forms) would have responded primarily 
to the two new forms on the transfer 
task. Since the two new forms would 
have had roughly equal probabilities of 
association with the two responses, the 
transfer task would essentially reduce 
to a simple two-choice discrimination for 
these Ss. Group NRS should, and did, 
learn the transfer task very rapidly. 


According to the same assumption, 
Group R (trained on forms, transferred 
to the same forms re-paired with the 
responses) should also have continued 
to respond to stimuli on the form dimension
early in transfer. But because the
forms available to these Ss had been 
differentially reinforced in training, and 
were re-paired in transfer, initial prob- 
ability of association between the forms 
and responses would not be equal. 
Thus, although Group R was also faced, 
it is assumed, with only a two-choice 
discrimination in transfer, the training
associations had to be extinguished
before the transfer task could be learned.
Group R should, and did, transfer more
slowly than Group NRS, and should
make many errors. And as has been
shown, Group R should and did transfer
more slowly than an appropriate control
group.


Still following the basic assumption,
Group NRD (trained on number, transferred
to forms) would continue to
respond to the number dimension during
transfer. Since no number stimuli, new
or old, were consistently reinforced
during transfer, the task for these Ss
became quite difficult. First, responses
to stimuli on the number dimension had
to be extinguished. There now remained
two dimensions, form and color, from
which to choose; since both these dimensions
had been extinguished during
training, there was no basis for choosing
between them. So Group NRD next
had to discover that it was forms, not
colors, that was being reinforced during
transfer. Finally, these Ss had to discover
which form went with which
response. It seems clear that the total
number of alternatives from which to
choose was greater for Group NRD than
for any other group (Goss, 1961, has
come to the same conclusion), and Group
NRD showed the poorest performance of
all in transfer. Viewed this way, it is not
surprising that Group NRD should be 
inferior to Group Cd and Group NRS. 
But in this and in all previous studies 
that have made the comparison, Group 
NRD also learned the transfer task more 
slowly than even the negative transfer 
group (R). This finding simply shows 
that having to deal with several stimulus 
alternatives that have previously been 
subjected to differential reinforcement 
and extinction is more difficult than 
having to deal with a re-paired situation 
involving basically only two associations, 
a difference in task difficulty that would 
seem to have little theoretical import. 


SUMMARY 


In a study of human concept formation,
two experimental groups were trained on a
two-choice form discrimination, with number
and color stimuli irrelevant. For one group
(reversal shift) the transfer task consisted of
re-pairing the training stimuli with the re- 
sponses; for the other group (nonreversal 
shift to the same dimension), two new forms 
were used as transfer stimuli. A third ex- 
perimental group (nonreversal to a different 
dimension) was trained on number stimuli and 
transferred to forms. A control group was 
trained on stimuli differing from any of those 
used for experimental groups, then trans- 
ferred to forms. The same two-choice form 
discrimination, with number and color
irrelevant, was used as the transfer task for 
all groups. 

The results showed three significantly 
different levels of performance (in terms of 
trials to learn) on the transfer task. In order 
of best to poorest performance, the levels 
were: (a) nonreversal to same dimension, 
and control; these groups did not differ, 
(b) reversal shift, and (c) nonreversal to 
different dimension. As compared to the 
control, the reversal group showed significant 
negative transfer. It was suggested that 
performance of all groups could largely be 
accounted for by a combination of two 
factors: nonspecific transfer, and a specific 
tendency to continue to respond in transfer
to the dimension of stimuli reinforced in
training.


REFERENCES 
Buss, A. H. Rigidity as a function of reversal
and nonreversal shifts in the learning of
successive discriminations. J. exp. Psychol.,
1953, 45, 75-81.

Buss, A. H. Reversal and nonreversal shifts
in concept formation with partial reinforcement
eliminated. J. exp. Psychol.,
1956, 52, 162-166.

Goss, A. E. Verbal mediating responses and
concept formation. Psychol. Rev., 1961,
68, 248-274.

Harrow, M., & Friedman, G. B. Comparing
reversal and nonreversal shifts in concept
formation with partial reinforcement controlled.
J. exp. Psychol., 1958, 55, 592-598.

Kendler, H. H., & D'Amato, M. F. A
comparison of reversal and nonreversal
shifts in human concept formation behavior.
J. exp. Psychol., 1955, 49, 165-174.

Kendler, H. H., & Mayzner, M. S., Jr.
Reversal and nonreversal shifts in card-sorting
tests with two or four sorting
categories. J. exp. Psychol., 1956, 51, 244-248.

Porter, L., & Duncan, C. P. Negative
transfer in verbal learning. J. exp.
Psychol., 1953, 46, 61-64.


(Received November 2, 1961) 





EFFECTS OF SECONDARY REINFORCEMENT SCHEDULES
IN EXTINCTION ON CHILDREN’S RESPONDING 1


N. A. MYERS AND J. L. MYERS

University of Massachusetts


Strong secondary reinforcement effects
have not been consistently
demonstrated. Nor is there agreement
regarding the appropriate explanatory
concepts. In particular,
doubt has been cast upon the explanation
of Sr (secondary reinforcement)
in terms of “discrimination”
between conditioning and extinction
trials (Bitterman, Fedderson, & Tyler,
1953).

Support for the discrimination hypothesis
comes from a study by Melching (1954).
He presented two groups of rats with 50%
neutral stimulus (buzzer) in training and
found no difference in extinction responding
between the group given no buzz in extinction
and the group given 100% buzz in extinction.
A study by Myers (1960) presents negative
evidence for the discrimination hypothesis.
She trained children, using tokens as potential
secondary reinforcers, and found that
of the two groups trained with 50% token,
the group receiving 100% token during extinction
made significantly more responses
than the group receiving no tokens during
extinction.

Resolution of the differences in the
results of Myers and Melching is
difficult without further data. The
studies differed in the species of S
and in the type of neutral stimulus.
The present study was designed to
provide a further test of the “discrimination”
hypothesis with children
as Ss (as in the Myers’ study) and the
buzzer as reinforcer (as in Melching’s
study). Furthermore, a low rate of
presentation of both the primary and
neutral stimuli has been used during
training, in accord with recent data
on the effectiveness of such schedules
in establishing secondary reinforcers
(Fox & King, 1961; Zimmerman,
1957, 1959). An even more important
reason for using such schedules
is that the difference between the
training rate and 100% buzzer in
extinction should be clearly greater
than the difference in training rate and
0% in extinction, yielding a better
test of the “discrimination” hypothesis
than either the Myers or the
Melching study.

1 This research was supported by funds
from a National Institute of Mental Health
grant.



METHOD 


A pparatus 
a portable box designed to attract the 
of preschool children. On_ the 
painted a clown face, havin 
a push-button and a slot-tray 
mouth. M & M coated chocolate candy w 
dispensed through a tube to the mouth of 
clown, while a }-sec. buzz was heard fre 
the interior of the box. The E had 
and operated two silent knife | 
allowed administration of the 
reinforcement. 
ret orded 


he apparatus employed w 


front 


eyes, nose, 


acces 
s whi 
predetermined 
The number of responses w 
electri 


on an magnet counter 


mounted on the back of the box, 
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resp g each successive 
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attending kindergar 


between 
{1 mo 
ton, Massachusetts 
Each 
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SECONDARY REINFORCEMENT 


TABLE 1

DESIGN OF EXPERIMENT AND MEAN NUMBER
OF EXTINCTION RESPONSES
FOR EACH GROUP

         Conditioning       Extinction   Mean
Group    Candy    Buzzer    Buzzer       No. of
         (%)      (%)       (%)          Responses

E1       20       20        100          102.73
E2       20       20         20           55.80
E3       20       20          0           36.40
C1       20        0          0           72.80
C2       20        0        100           66.60



on a small table and S was instructed to sit
at the small chair in front of it. No written
instructions were read to S, but E standardized
the verbal instructions as much as
possible. Attention was called to the clown
face, especially the nose. The Ss were told
that “something happens when you press his
nose. Let’s see what happens.” The E
pressed the clown’s nose and thereby received
a buzz and an M & M. The S was encouraged
to try it also and was given one rewarded
preliminary trial. He was then told he could
stay and play the game “as long as you
want.” The E then sat down behind the
table, facing the open back of the box and S.
Each S was reinforced according to a
predetermined 20% reinforcement schedule
which delivered a total of 15 M & M candies.
Immediately following the fifteenth reinforcement,
the extinction period commenced;
no candy reinforcement was administered,
and each S was run until he stopped and
indicated a desire to return to the classroom or
until 5 min. had elapsed, at which time E
terminated the session. One last candy was
offered at the end of the extinction period.
Design.—Eight boys and 7 girls were
assigned randomly to each of four groups.
Fifteen more children were assigned to a
second control group, run after the others.
Three E groups received a 4-sec. buzz every
time a candy was received during training
(20% reinforcement with M & M and buzz).
They differed only with respect to extinction
treatment: one group (100% buzz) received
the buzz for every button press in extinction;
one group (20% buzz) heard the buzz on
approximately every fifth response, as in
training; the third group (0% buzz) never
heard the buzzer in extinction. A control
group (C1) never received the buzzer either
during training or extinction; they received
20% reinforcement with M & M candy alone
during training, and no reinforcement during
extinction. A second control group (C2) also
received 20% reinforcement with M & M
candy alone during training, but received the
buzz for every button press in extinction.
The design is presented in Table 1, along
with the mean number of extinction responses
for each group.


RESULTS 


The mean numbers of responses, for 
successive minutes of extinction, for 
the five groups are presented in Fig. 1. 
An analysis of variance was performed 
on these data and yielded a significant 
difference between groups (F = 5.98, 
df = 4/70, P < .001). There was a 
significant decrease in responding for 
all groups over time (F = 60.49, 
df = 4/280, P < .001), but the 
Groups X Time interaction was not 
significant. 

Duncan’s multiple range test was
applied to compare the groups with
one another. All differences were
significant at the .01 level except those
between E2 and C1 (where P < .05),
between E2 and C2, and between C1
and C2.

Fig. 1. Mean number of responses for
successive minutes of extinction.


DISCUSSION


A simple discrimination explanation of
Sr effects (Melching, 1954) would predict
greatest number of extinction responses
from the 20% buzz group in this study,
since the schedule of Sr presentation
during conditioning and extinction is
identical, and therefore, the extinction
period is less discriminable from the
former conditioning period than for any
other group. Zimmerman (1957, 1959)
also would predict greatest response
strength for the 20% buzz group, arguing
that any Sr value of the buzzer accrued
during conditioning would be dissipated
more slowly by more occasional presentation
during extinction. However,
the results quite clearly refute these
hypotheses: the 100% buzz group made
almost twice as many extinction responses
as the 20% buzz group. And,
when the 20% buzz group was compared
with the primary control group, it was
seen that the buzz presented 20% of the
time did not operate to increase response
strength above primary extinction level.
Furthermore, the simple discrimination-
generalization model would predict a
higher level of extinction responding for
the 0% buzz group than for the 100%
buzz group, in this experiment, since the
change from 20% buzz to 0% buzz is not
as great as the change from 20% to 100%
buzz, therefore not as discriminable, and
conditioned responses should be generalized
more easily. Again, the results
do not support this prediction: the 100%
buzz group made almost three times as
many extinction responses as the 0%
buzz group.

It appears that some notion of a
supplementary reinforcing role of the
buzzer stimulus, as suggested by Myers
(1958) and Myers (1960) is needed to
account for the significantly greater
number of responses made with 100%
buzz presentation in extinction. It may
be noted that the significant difference
between the 100% buzz group and the
novel-stimulus control group which also
received 100% buzz in extinction is
evidence that the reinforcing effect is
due to previous association with the
candy.

This modified discrimination model
assumes that response strength in extinction
is a function of the difference in
percentage buzz from training to extinction.
In contrast to the Bitterman-
Melching approach, the sign of the difference
is retained; increments in percentage
buzz should yield more responses
than no change, which in turn should
result in more responses than decrements.
This prediction is clearly borne out in the
present study. The only incompatible
finding is the significant difference between
Groups E2 (20% buzz in conditioning
and extinction) and C1 (0% buzz
in conditioning and extinction). The
theory would predict no difference, as
would also the Bitterman-Melching approach.
However, it should be noted
that this difference was of considerably
less statistical significance than those
differences predicted by the theory.
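
The ordering implied by the modified model can be made concrete with a short sketch. It is an illustration only: the buzzer percentages are taken from the Design paragraph, the group labels E1, E2, E3, C1, and C2 follow our reading of the garbled subscripts (an assumption), and the function names are hypothetical. The sketch classifies each group by the sign of the change in percentage buzz and sorts the groups by predicted response strength; it does not test the prediction against data.

```python
# Percentage of buzzer presentations as (training, extinction), per the Design paragraph.
# Group labels are our reading of the text's subscripts (an assumption).
SCHEDULES = {
    "E1": (20, 100),
    "E2": (20, 20),
    "E3": (20, 0),
    "C1": (0, 0),
    "C2": (0, 100),
}

# Predicted response strength: increments > no change > decrements.
RANK = {"increment": 2, "no change": 1, "decrement": 0}


def signed_change(training_pct, extinction_pct):
    """Signed difference in percentage buzz from training to extinction."""
    return extinction_pct - training_pct


def predicted_class(change):
    """Classify a signed change as an increment, no change, or a decrement."""
    if change > 0:
        return "increment"
    if change < 0:
        return "decrement"
    return "no change"


def predicted_order(schedules):
    """Groups sorted from most to fewest predicted extinction responses."""
    return sorted(
        schedules,
        key=lambda g: RANK[predicted_class(signed_change(*schedules[g]))],
        reverse=True,
    )


for group in predicted_order(SCHEDULES):
    change = signed_change(*SCHEDULES[group])
    print(group, change, predicted_class(change))
```

Groups that fall in the same class (for example E2 and C1, both "no change") are predicted not to differ, which is why the text singles out the E2 versus C1 difference as the incompatible finding.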


SUMMARY 


Kindergarten children were trained in a
free operant situation with candy as a reward.
A group receiving 20% buzzer presentations
in training, and shifted to 100% buzzer in
extinction responded significantly more than
similarly trained groups shifted to 20%
buzzer and 0% buzzer in extinction. This
100% buzzer group also performed better
than a group which was similarly extinguished
but which had not experienced the buzzer in
training. It was concluded that a secondary
reward effect was demonstrated.

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SIMULTANEOUS 


INDUCTION OF MULTIPLE ANCHOR
EFFECTS IN THE JUDGMENT OF FORM 1


EDWARD D. TURNER AND WILLIAM BEVAN

Kansas State University


The traditional approach in psychophysics
has been to hold constant all
properties of the stimuli to be judged
except one and to plot responses as a
function of this variable. Meanwhile,
perhaps the most obvious characteristic
of judgmental situations outside
the laboratory is that stimuli to be
judged vary among themselves on a
number of dimensions. A recent
solution to the problem of multidimensionality
has been to have
stimuli judged for similarity and to
express these relationships as distances
in a Cartesian space (Torgerson,
1958). An alternative experimental
strategy, when the stimulus
dimensions can be identified, consists
of limiting these dimensions to some
small number greater than one, and
allowing them to vary with reference
to each other in certain prescribed
ways. This not only allows for an
assessment of the psychophysical relationships
involved but may provide
some information on the processes of
judgment.

The present experiment employs
the method of single stimuli. It differs
from the usual application in several
ways: the stimuli differ with respect
to three different physical dimensions;
variation on each dimension is independent
of variation on the other
two; and three judgments, one for
each dimension, are made following
the presentation of each stimulus.


This experiment wv 
Contract 5290 (01 


State Univer 


between Kan 
Research The 
indebted to Paula Oppy and Jo 


their help in the collection of dat 


Office of Naval 


The purpose of the experiment was 
to determine whether or not anchoring 
effects typically obtainable for stimuli 
varying on a single dimension (Wood- 
worth & Schlosberg, 1954) could also 
be obtained for multidimensionally
varying stimuli.2


METHOD 


Subjects.—The Ss were 30 female undergraduates.
They were divided randomly into
three groups of 10 each. Group A received
anchor stimuli which deviated from series
stimuli in size and shape but were of a medium
lightness. Group B received anchors which
deviated in color and shape but which were of
a medium size. Group C received anchors
which were deviant in color and size but of an
intermediate shape. All Ss judged all stimuli,
including anchors, on all three dimensions.

Stimuli.—The series stimuli consisted of
gray rectangular shapes, each mounted on
white (Crescent No. 100) heavy illustration
board, 31.5 X 22.5 cm., for presentation in a
Gerbrands tachistoscope. The series members
differed from each other such that there
were 4 different shapes, 4 sizes, and 4
degrees of lightness. In order to keep the
stimulus series to manageable length, 16
combinations of shape, size, and color (lightness)
were selected from the 64 possible. This
was done by arranging the 16 possible size-shape
combinations in a matrix and
then superimposing on this, in Latin square
fashion, the 4 degrees of lightness such that
each color appeared in each column and each
row only once. The colors were grays
No. 7, 9, 10, and 11. The Munsell
equivalents as well as other physical
properties of the stimuli are given in
Table 1.
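
The selection of 16 of the 64 possible combinations can be sketched as follows. This is a minimal illustration, assuming a cyclic Latin square; the text does not say which square was used, and the labels for sizes, shapes, and lightness levels are hypothetical placeholders rather than the actual stimulus values.

```python
def cyclic_latin_square(n):
    """Return an n x n Latin square whose (row, col) entry is (row + col) mod n."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def select_stimuli(sizes, shapes, lightnesses):
    """Cross every size with every shape and superimpose lightness by Latin square."""
    n = len(sizes)
    square = cyclic_latin_square(n)
    return [
        (sizes[row], shapes[col], lightnesses[square[row][col]])
        for row in range(n)
        for col in range(n)
    ]


# Hypothetical labels for the four levels on each dimension.
SIZES = ["size1", "size2", "size3", "size4"]
SHAPES = ["shape1", "shape2", "shape3", "shape4"]
LIGHTNESSES = ["gray1", "gray2", "gray3", "gray4"]

stimuli = select_stimuli(SIZES, SHAPES, LIGHTNESSES)
for stimulus in stimuli:
    print(stimulus)
```

Because each lightness level occurs exactly once in every row and every column, lightness is balanced against both size and shape even though only a quarter of the possible combinations are used.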


? What we here refer 
per onal communicatior 
predomit int timul 

devia t stim 
through I truction 
rather tha eries je 
ire more pote nt th in predominat t stim ili in 
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TABLE 1 


PHYSICAL PROPERTIES OF THE 16
SERIES STIMULI

Size           Shape                         Lightness
(Approx.       (Length X Width in Cm. and    (Munsell
Sq. Cm.)       Length X Width Ratio)         Number)

14             3.75 X 3.75 (1:1)
               3.60 X 3.90 (1:1.08)
               3.45 X 4.05 (1:1.21)
               3.30 X 4.20 (1:1.27)

               5.00 X 5.00 (1:1)
               4.80 X 5.20 (1:1.08)
               4.60 X 5.40 (1:1.21)
               4.40 X 5.60 (1:1.27)

               6.25 X 6.25 (1:1)
               6.00 X 6.50 (1:1.08)
               5.75 X 6.75 (1:1.21)
               5.50 X 7.00 (1:1.27)

               7.50 X 7.50 (1:1)
               7.20 X 7.80 (1:1.08)
               6.90 X 8.10 (1:1.21)
               6.60 X 8.40 (1:1.27)


Two similar stimuli were used as anchors 
for each group. These represented extreme 
values on two dimensions and an intermediate 
value on the third (control) dimension. 
Group A received the size-shape anchors. 
These were two relatively large rectangles,
96 cm.² in area, with a length by width ratio of
1:1.50. (Their dimensions were 8 X 12 cm.)
They had Munsell values of 5 and 5.5 and
thus were intermediate grays. Group A,
therefore, provided anchor data on size and
shape and control data for color. Group B
received the color-shape anchors. These were
two black 1:1.50 rectangles of intermediate
(4 X 6 cm. and 5.0 X 7.5 cm.) size. Group B
thus provided anchor data for the color and
shape dimensions and control data for size.
Group C received the color-size anchors, two
large black rectangles, 9.6 X 10.4 cm. and
9.2 X 10.8 cm. They were intermediate in
shape (length to width ratios of 1:1.08 and
1:1.21, respectively). Group C provided
anchor data on color and size and control data 
on shape. 

Assuming the stimuli designated as anchors 
to be effective, it was expected that the size 
judgments for the anchored groups would be 
reliably smaller than those of the control, the 
shape judgments would shift toward greater 
squareness, and the color judgments toward 
greater lightness. 

Each S made a total of 72 judgments for
each dimension: each of the 16 series members
was presented 3 times and each of the two
anchors 12 times. The order of presentation
on the 72 trials was random.
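
The trial count follows from 16 x 3 + 2 x 12 = 48 + 24 = 72. A small sketch of how such a randomized sequence might be assembled is given below; the stimulus identifiers, the function name `build_trial_order`, and the use of a seeded shuffle are illustrative assumptions, since the text does not describe how the random order was generated.

```python
import random
from collections import Counter


def build_trial_order(series_ids, anchor_ids, series_reps=3, anchor_reps=12, seed=None):
    """Return a shuffled list with each series member and each anchor repeated as specified."""
    trials = [s for s in series_ids for _ in range(series_reps)]
    trials += [a for a in anchor_ids for _ in range(anchor_reps)]
    rng = random.Random(seed)  # a seed is used only to make the sketch repeatable
    rng.shuffle(trials)
    return trials


# Hypothetical identifiers for the 16 series stimuli and the two anchors.
SERIES = [f"series{i}" for i in range(1, 17)]
ANCHORS = ["anchor1", "anchor2"]

trial_order = build_trial_order(SERIES, ANCHORS, seed=0)
counts = Counter(trial_order)
print(len(trial_order), counts["series1"], counts["anchor1"])
```

With these repetition counts the two anchors together occupy one third of the trials (24 of 72).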

Procedure.—The Ss were tested individually.
Presentations of stimuli were at
intervals of 10 sec. for durations of .5 sec. 
The psychophysical method was the rating 
scale version of the absolute method. Ratings 
were required on all three dimensions for each 
stimulus presentation. Thirteen categories 
were available for each judgment: The shape 
categories varied from 0 (perfectly square) 
to 12 (extremely nonsquare). The size 
categories were —6 (extremely small) through 
0 (neither large nor small) to +6 (extremely 
large). The color categories varied from —6 
(very dark gray) through 0 (neutral gray) 
to +6 (very light gray). The Ss were also 
encouraged to use additional categories at 
either or both ends of any scale when they 
regarded this to be necessary to the expression 
of their judgments. The anchors were in no 
way identified by E as special stimuli. Each S 
recorded her judgments upon a mimeographed 
data sheet provided for her. Median judg- 
ments were computed for each S’s judgments 
of each individual stimulus. Means of these 
medians were then used as cell entries in all
analyses performed upon the data. 
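
The two-step summary described above (a median within each S, then a mean across Ss) can be sketched as follows. The ratings are invented and the function name `cell_entry` is hypothetical; the sketch only illustrates the order of the two operations for a single stimulus on a single dimension.

```python
from statistics import mean, median


def cell_entry(ratings_by_subject):
    """Mean across Ss of each S's median rating for one stimulus on one dimension."""
    return mean(median(ratings) for ratings in ratings_by_subject.values())


# Hypothetical size ratings (-6 to +6) given by three Ss to one series stimulus
# on its 3 presentations; not the study's data.
ratings = {
    "S1": [-2, -1, 0],
    "S2": [1, 1, 3],
    "S3": [-3, 0, 0],
}
print(cell_entry(ratings))
```

Taking the median within each S first limits the influence of a single stray rating before the group average is formed.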


RESULTS AND DISCUSSION 


Figure 1 summarizes the data for
each dimension separately. The anchor
data consists of the average
judgments of the several series stimuli
made by two groups; the control data
derives from the judgments of the
third group. Table 2 presents summaries
of analyses of variance used to
evaluate these data. Three separate
analyses were performed, one on the
data of each dimension. In the
interest of simplicity of presentation,
the within-Ss sources (Between Stimuli,
Stimuli X Ss, etc.) have been collapsed,
so that the summaries indicate
differences between groups of Ss, each
of whom is represented by one average
judgment per dimension. Similarly,
the Stimuli X Groups interaction is
not identified. Meanwhile, the between-groups
source has been partitioned
into predicted differences (anchor
vs. no anchor) and nonpredicted
differences. Since the direction of
each possible anchor effect can be
predicted (H1), a one-tailed criterion
of significance is applied. At the same
time there is no basis for expecting the
anchor effect to be greater in one
anchor group than the other (H1).
Therefore, a two-tailed criterion is
used in these cases.


Fig. 1. Average size, shape, and color anchor effects for the several multidimensional groups.
(The solid line represents the anchor data, the dotted the control data. The anchor curve for
size derives from the data of Groups A and C, its control from Group B. The shape anchor
curve represents data of Groups A and B; its control is Group C. The color data are obtained
from Groups B and C; its control is Group A.)


TABLE 2

SUMMARIES OF ANALYSES OF VARIANCE PERFORMED UPON THE JUDGMENTS FOR EACH
OF THE THREE DIMENSIONS ON WHICH THE SERIES STIMULI VARIED

Size
  Between groups
    Anchor (Groups A, C) vs. No Anchor (Group B)
    Group A vs. Group C
  Pooled between Ss

Shape
  Between groups
    Anchor (Groups A, B) vs. No Anchor (Group C)
    Group A vs. Group B
  Pooled between Ss

Color
  Between groups
    Anchor (Groups B, C) vs. No Anchor (Group A)
    Group B vs. Group C
  Pooled between Ss

.05; two-tailed
.05; one-tailed
.001; one-tailed


Inspection of Fig. 1 indicates the 
simultaneous induction of anchor 
effects for all three dimensions. In 
every case the solid line lies in the 
predicted relationship to the dotted. 
This is supported by the data of 
Table 2. The judgments of the con- 
trol group are significantly different 
from the judgments of the combined 
anchor groups for all dimensions. In 
no case, however, were there reliable 
differences between the two anchor 
groups. Further evidence that the 
differences between groups are anchor 
effects is indicated by the difference 
in slope between each pair of curves. 
When the anchor is above the series, 
as in the case of size and shape, the 
curves should be most widely sepa- 
rated at their upper end; when it is 
below, the separation should be great- 
est at the lower end. The data of 
Fig. 1 are in line with this expectation. 
Finally, it is interesting to note that 
simultaneous anchor effects may be 
either in the same or opposite direc- 
tions. In the case of Group A, which 
displayed size and shape anchor 
effects, both anchors were above the 
series and the judgmental shifts were 
downward. However, in Group B, which
displayed color and shape anchor effects,
and Group C, which showed size and color
effects, one anchor was above and the other
below the series, and the anchor differences
were in opposite directions.


AND WILLIAM BEVAN

An incidental finding is the dip in 
the shape curves. It will be remem- 
bered that Shape 1 is a perfect square 
and Shapes 2, 3, and 4 are rectangles 
of increasingly greater width. Since 
squares tend to appear taller than 
they are wide (the horizontal-vertical 
illusion), it is not unreasonable that 
Shape 2 is judged more square than 
the square itself. 


SUMMARY 


The purpose of the present experiment was 
to determine if an anchor stimulus which 
differed from its psychophysical series on more 
than a single dimension could effect shifts in 
judgment on each of the dimensions on which 
it differed from the series. Accordingly, Ss 
were asked to judge a series of rectangular 
figures which varied in shape, size, and light- 
ness. Anchor stimuli which represented 
marked deviations from the series values on 
two but not on the third dimension were 
included in the order of presentation. Three 
groups of Ss were used so that all combina- 
tions of two dimensions were anchored with 
the third available for control data. Analyses 
of variance performed on the data for each
of the three dimensions indicated that
multiple anchoring had occurred.


DISCRIMINATION AND MEDIATED GENERALIZATION IN


When responses learned to a given
stimulus occur in the presence of other 
stimuli which are not physically 
similar to the original, we have an 
example of secondary or mediated 
stimulus generalization. The gen- 
eralization in such cases appears to be 
based on previous experiences of the 
organism being studied. Some of the 
best evidence for mediated generaliza- 
tion comes from studies of what has 
been called semantic generalization, in 
which a response conditioned to a
word generalizes to other words
similar in meaning to the original.
Reviews of the experimental literature 
on semantic generalization may be 
found in Cofer and Foley (1942) and 
Osgood (1953). 

Behavior theorists have attempted 
to account for these phenomena by 
assuming that Ss make implicit responses
preceding the overt response and that these
implicit responses produce stimuli which
partly determine the overt response to the
presented stimulus. For example, in the
case of generalization from one word 
to another word similar in meaning, 
it has been assumed that there are 
learned mediating responses to words 
which represent their meanings. The 
more nearly synonymous two words 
are, the greater is the similarity of 
these responses, i.e., the greater is the 
physical similarity between the patterns
of stimulation produced by the mediating
responses. A response learned to a word
will also be learned to the stimuli produced
by the mediating response representing the
meaning of that word. Therefore, by physical
similarity, the response will generalize
to the stimuli produced by mediating
responses to other words similar in
meaning to the original, generating a
semantic gradient of generalization
(Osgood, 1953).


1 This research was conducted at Indiana
University while the author was a National

This paper reports the results of an 
experiment designed to test a model 
for mediated generalization, developed
within the framework of statistical
learning theory. The model specifies
the assumed mediation process more
precisely than has usually been the
case, and yields quantitative predictions
of the effects of mediation in a
specific experimental situation. A
brief theoretical review will be given
here; for a full account see Popper 
(1959). 

The mediation model is based on a 
model for discrimination learning 
developed by Burke and Estes (1957). 
Their model applies to discrimination 
problems which consist of a series of 
trials. Each trial is initiated by a 
stimulus to which S responds, and is 
terminated by a reinforcing event. 
A stimulus is conceptualized as a set 
of elements available for sampling by 
S, with each element conditioned to 
one and only one of the response 
alternatives in the situation. Each 
available element has a probability θ
of being sampled on a particular trial. 
The probability of each response is 
equal to the proportion of sampled 
elements conditioned to that response. 


When a reinforcing event terminates
a trial, all elements in the sample
become conditioned to the response
corresponding to that event.

The mediation model is an extension
of the Burke and Estes approach
to problems in which mediating
responses are assumed to be occurring
and influencing the final overt response.
Specifically, it is assumed that mediating
responses in the presence of a stimulus
produce cues which can be represented as
additional elements in the set corresponding
to that stimulus. These elements are
therefore available for sampling when
that stimulus is present, and if their
conditioning status is known, their
effect on the overt response can be
predicted.
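
The sampling and conditioning rules just described can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the formal model of Burke and Estes (1957) or of Popper (1959): the function name, the starting state (every element conditioned to the second response), and the rule of guessing at .50 when the sample is empty are all hypothetical choices made for illustration. Mediating cues would enter such a simulation simply as additional entries in the same list of elements.

```python
import random


def simulate(n_elements, theta, reinforce_prob, n_trials, rng):
    """Simulate one stimulus in a stimulus-sampling model.

    Each element is conditioned to response A1 (True) or A2 (False).
    Returns the probability of response A1 on every trial.
    """
    # Assumption: all elements start out conditioned to A2.
    conditioned_to_a1 = [False] * n_elements
    history = []
    for _ in range(n_trials):
        # Each available element is sampled independently with probability theta.
        sample = [i for i in range(n_elements) if rng.random() < theta]
        if sample:
            # Response probability = proportion of sampled elements conditioned to A1.
            p_a1 = sum(conditioned_to_a1[i] for i in sample) / len(sample)
        else:
            p_a1 = 0.5  # assumption: guess when nothing is sampled
        history.append(p_a1)
        # The reinforcing event conditions every sampled element to its response.
        event_is_e1 = rng.random() < reinforce_prob
        for i in sample:
            conditioned_to_a1[i] = event_is_e1
    return history
```

Calling `simulate(20, 0.3, 0.9, 140, random.Random(0))`, for example, traces one hypothetical S through 140 trials on a stimulus followed by the first reinforcing event with probability .90.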


In this experiment, Ss were given 
two successive probabilistic discrimi- 
nation problems. Their performance 
on a third problem was predicted on 
the assumption that it would be 
affected in a specified way by mediat- 
ing responses, resulting from the 
training on the two initial problems. 


METHOD 


The first two probabilistic discrimination
problems will be designated Discrimination a
and Discrimination b, respectively. On each
trial of Discrimination a, one of two stimuli,
a green light or a white light, appeared,
followed by one of two reinforcing events, the
letter X or the letter O. Immediately after the
light appeared, S was to respond by saying
either X or O, to indicate which outcome he
expected on that trial. The reinforcing events
were probabilistically related to the stimuli,
i.e., the probability of X or O on each trial
depended only on the stimulus initiating the
trial.


The Ss were then trained on Discrimination
b, which was another two-stimulus, two-response
problem. The stimuli were X and O,
and the reinforcing events were two nonsense
syllables, MAF and KUV. Discriminations a
and b were related in that the reinforcing
events (and responses) of Discrimination a
were the same as the stimuli of Discrimination
b. Interspersed among the trials on Discrimination
b were a few trials on which the
stimulus was a green light or a white light,
as in Discrimination a, but the S was required
to respond with MAF or KUV, as in Discrimination
b. No reinforcing event occurred on
those trials, which will be referred to as test
trials.

Finally, a third problem, Discrimination c, 
was given in which the stimuli were the green 
and white lights, the reinforcing events were 
MAF and KUV, and MAF and KUV each had
probability .50 of occurring, regardless of 
which stimulus initiated the trial. 

The stimuli, responses, and reinforcing
events in Discrimination a will be denoted by
T1 and T2, A1 and A2, and E1 and E2,
respectively, where reinforcing event Ei means
reinforcement of response Ai. Similarly, the
stimuli, responses, and reinforcing events in
Discrimination b will be denoted by T3 and
T4, A3 and A4, and E3 and E4.

Subjects.—The Ss were 96 Indiana University
students taking the first semester of
introductory psychology. They were assigned
randomly to experimental groups, and
tested individually.

Apparatus.—A vertical black wooden
board, 30 in. high and 36 in. wide, was supported
on a table 30 in. high. A diffusing
screen made of a double layer of sanded
Plexiglas, 4 in. high and 21 in. wide, was
mounted on the board. Three inches below
the center of the screen was a window of one-way
mirrored glass, 2 in. in diameter, which
became transparent only when lighted from
behind. Another window of the same kind
was below the first, with 3 in. between the
centers of the two windows. A door on the
back of the apparatus permitted the insertion
of cards immediately behind the windows.
Two 6.3-v. pilot light assemblies were
mounted 12 in. apart behind the Plexiglas
screen. Colored jewel caps covered the lights
so that from a frontal view the left light
was green and the right light was white. Two
6.3-v., .15-amp. incandescent bulbs were
mounted behind each window, one on each
side. A cam-operated timer controlled the
time intervals during which the appropriate
lights came on.

A 5 X 8 in. index card was behind the
windows on each trial. Some of the cards
used had either X or O typed in pica capitals
so that it would appear in the center of the
upper window when illuminated, and either
MAF or KUV, typed in pica capitals, so that it
would appear in the center of the lower
window when illuminated. Other cards were
blank in either the upper or lower position.

Procedure and experimental design.—Each
S sat 3 ft. in front of the table which supported
the stimulus panel. The room was
dark, except for a 100-w. bulb shining on the
front of the windows.

Instructions were given for Discrimination
a, presenting it as a prediction experiment
and emphasizing the importance of trying to
make as many correct choices as possible.
After 4 practice trials, Ss were given 140
trials on Discrimination a, 70 T1 and 70 T2
trials.

Immediately following this phase, Ss were
given instructions for the remainder of the
experiment: Discrimination b, with test trials
interspersed, and Discrimination c. They
were told that any one of four events could
begin a trial: the green light or the white light,
as before, or X or O. On every trial, they
were to guess whether MAF or KUV would
follow. They were told also that on some of
the trials a blank card would follow their
guesses, indicating that they were not being
informed of the correct answer for those trials.
An additional 98 trials were given, distributed
in the following way. Trials 6 and 17
were unreinforced Discrimination b trials, i.e.,
the stimuli were T3 and T4, in random order,
and no reinforcing event occurred. They
were included so that Ss would have some
experience with unreinforced trials prior to
the test trials, and they have been omitted in
all analyses. Trials 33, 45, 56, and 66 were
test trials, with stimuli T1 and T2 and no
reinforcing events. For half the Ss in each
experimental subgroup, the stimuli appeared
in the order T1-T2-T2-T1, and for the rest in
the order T2-T1-T1-T2. The remainder of the
trials up to Trial 66 were reinforced Discrimination
b trials, 30 T3 and 30 T4 trials.
Trials 67-98 were Discrimination c trials,
16 T1 and 16 T2 trials.

On each trial of the experiment, the stimulus
appeared for 2 sec., followed immediately
by the reinforcing event (or a blank white
background) for 2 sec., and there was a 6-sec.
intertrial interval. The complete experimental
session lasted about 45 min.

The probability of reinforcing event Ej
following stimulus Ti will be designated πij.
The Ss were divided into two main experimental
groups. On Discrimination a, for
Group I, π11 was equal to .90 and π21 was
equal to .10, and for Group II, π11 was equal
to 1.00 and π21 was equal to .50. Except for
the different π values in Discrimination a,
both groups were treated identically; for
both, the Discrimination b values were π33
equal to 1.00 and π43 equal to .00.

The sequences of trials on Discrimination a
were randomized with the restriction that
within each successive block of 20 trials, each
combination of stimulus and reinforcing event
was presented a number of times exactly equal
to its expected number, considering the reinforcement
probabilities. The sequences of
trials on Discrimination b were randomized
with the same restriction within each successive
block of 10 trials. On Discrimination c,
the randomization was restricted in the same
manner over the total set of 32 trials. On all
problems, a different randomization was used
for each S. The design was counterbalanced
by having eight subgroups, with different
identifications of the stimuli and reinforcing
events, within each main experimental group,
making a total of 16 subgroups, 6 Ss in each.
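
The restricted randomization described above can be sketched as follows. The code is an illustration only: the function names, the event labels "E1" and "E2", and the assumption that the two stimuli occur equally often within every block are hypothetical conveniences, not details taken from the original procedure.

```python
import random


def restricted_block(block_size, pi, rng):
    """Return one shuffled block of (stimulus, event) trials.

    pi maps each stimulus to the probability that the first event ("E1")
    follows it. Assumption: stimuli occur equally often within a block.
    """
    per_stimulus = block_size // len(pi)
    trials = []
    for stimulus, p_first in pi.items():
        n_first = round(per_stimulus * p_first)
        # The restriction requires the expected count to be a whole number.
        if abs(n_first - per_stimulus * p_first) > 1e-9:
            raise ValueError("expected count is not a whole number")
        trials += [(stimulus, "E1")] * n_first
        trials += [(stimulus, "E2")] * (per_stimulus - n_first)
    rng.shuffle(trials)  # random order inside the block
    return trials


def sequence(n_blocks, block_size, pi, seed=0):
    """Concatenate independently shuffled blocks into one trial sequence."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_blocks):
        trials += restricted_block(block_size, pi, rng)
    return trials
```

Under these assumptions, `sequence(7, 20, {"T1": 0.9, "T2": 0.1})` would give 140 trials in the pattern described for Group I on Discrimination a, with exactly nine T1-E1 trials and one T1-E2 trial in every block of 20. Using a different `seed` for each S corresponds to the separate randomization per S.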


RESULTS 


Discriminations a and b.—The proportion
of Ai responses on Tj trials,
within a given block of trials, will be
designated P(Ai|Tj). The changes in
P(A1|T1) and P(A1|T2) for both
groups, over 20-trial blocks on Discrimination
a and 10-trial blocks on
Discrimination b, are illustrated in
Fig. 1 and 2. According to the Burke
and Estes discrimination model, the
final mean probabilities of response A1
given stimulus T1 and response A1
given stimulus T2 should both be between
.10 and .90 for Group I, and
above .50 for Group II. However,
the final P(A1|T2) for Group II, .44,
is significantly below .50 (t = 2.08,
P < .05). In Group I, both final
proportions are outside the theoretical
limits: P(A1|T1) = .91,
P(A1|T2) = .06. Since P(A1|T1) and
P(A2|T2), i.e., 1 − P(A1|T2), are
measures obtained under identical
experimental conditions in Group I,
they were combined in order to get an
overall test of the deviation from
theoretical bounds in that group. As
the distribution of scores is highly
skewed, there is no really adequate
test for the statistical significance of
the deviation from .90. The obtained
deviation, .03, is 1.82 times its
standard error. Furthermore, a t test
of the difference between the proportions
on Blocks 6 and 7 indicates a
significant increase (t = 3.21, P < .01),
suggesting that continued trials might
have led to a larger discrepancy.


Fig. 1. Mean P(A1|T1) and P(A1|T2) over
20-trial blocks for Groups I and II.


Fig. 2. Mean P(A3|T3) and P(A3|T4) over
10-trial blocks for Groups I and II.


TABLE 1 
PROPORTIONS OF A3 RESPONSES OVER FOUR-TRIAL
BLOCKS ON DISCRIMINATION c


Blocks 


Group 


Test trials.—Two T1 test trials and
two T2 test trials were given in order
to investigate the dependence among
successive unreinforced responses. A
preliminary study had shown that a
series of unreinforced test trials did
not give independent estimates of
response probability, since most Ss
adopted a consistent pattern, always
making one response to one stimulus
and the other response to the other
stimulus. Chi square tests were used
to investigate response dependence on
these trials.2 The responses on the
first T1 test trial and the first T2 test
trial did not deviate significantly from
independence, while responses on the
second test trial with each stimulus
were significantly dependent in the
direction suggested by the preliminary
study. Therefore, only the first T1
test trial and the first T2 test trial for
each S were used in testing the
predictions derived from the model.
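
As an illustration of the kind of test involved, the following sketch computes a Pearson chi square for a 2 X 2 table of responses on two test trials. It is offered under explicit assumptions: the function name is hypothetical, no continuity correction is applied (the text does not say whether one was used), and any counts passed to it here are invented for illustration rather than taken from the experiment.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (1 df, no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected.
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
```

With rows giving the response on one test trial and columns the response on the other, a value above 3.84 would indicate dependence at the .05 level with 1 df. A hypothetical table such as `chi_square_2x2(10, 20, 20, 10)` exceeds that value, while a table with equal cells gives zero.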

The observed proportions on
these test trials were: For Group I,
P(A3|T1) = .73 and P(A3|T2) = .27;
for Group II, P(A3|T1) = .58 and
P(A3|T2) = .44. The difference between
the two proportions was significant
for Group I (χ² = 12.97,
P < .001), but not for Group II
(χ² = 1.16, P > .10).

Discrimination c.—The 16 T1 trials
and the 16 T2 trials on Discrimination
c were each divided into four 4-trial
blocks, and the proportion of A3
responses in each block was computed
for Groups I and II. The results are
given in Table 1. It had been expected
that, as training progressed on
Discrimination c, P(A3|T1) and
P(A3|T2) would change towards .50.
To investigate changes, the difference
between the proportion of A3 responses
on the first 8 trials and the
last 8 trials was computed for each
type of trial and each group. The
significance of the differences was
evaluated with t tests, and only the
difference for Group I on T1 trials was
significant (t = 2.94, P < .01). Since
so little change occurred over the 32
trials on Discrimination c, the proportions
over all trials were used in
further analyses.

An analysis of variance was carried
out, using the proportions of A3
responses over the trials of Discrimination
c, to determine the effects
of group, subgroup, type of trial
(T1 or T2), and the interactions among
these. No effects even approached
significance except type of trial
(F = 14.76, P < .001). Individual t
tests indicated that the difference between
P(A3|T1) and P(A3|T2) was
significant beyond the .02 level for
each of the two experimental groups.


2 The model does not imply strict independence
of responses, but the expected
degree of dependence cannot easily be
determined, and in any case would be so
small that the hypotheses of strict independence
provide a very close approximation to the
predictions which could be derived.


DISCUSSION 


In specifically applying the mediation 
model to the experiment reported here, 
the assumed situation on the test trials 
will be discussed first. On a test trial, a 
stimulus from Discrimination a was pre- 
sented, but S was required to make one 
of the two responses learned in Discrimination
b. It is assumed that, on the
presentation of T1 or T2, S responded
implicitly with A1 or A2, the responses
conditioned to these stimuli in Discrimination
a. The probability of each was
assumed to be equal to the probability of
the same overt response, given that
stimulus, at the end of training on
Discrimination a. It is assumed that the
implicit response A1 produced stimulus
elements which were a subset of the
elements associated with the presence of
T3 in Discrimination b, and that the
same relationship held for A2 and T4.
The probability with which these elements
were conditioned to A3 or A4 was
determined, therefore, by the training on
Discrimination b. The predicted response
probabilities on the test trials are
therefore a function of the training on
both Discriminations a and b.
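
The logic of this paragraph can be sketched as a simple probability mixture. This is a deliberately simplified illustration and not the quantitative derivation given by Popper (1959): it assumes, hypothetically, that the overt response on a test trial is governed entirely by the cues of the implicit response, ignoring any elements contributed by the light itself. The function name and the nominal input values are likewise assumptions.

```python
def mediated_prediction(p_implicit_a1, p_a3_given_t3, p_a3_given_t4):
    """Probability of overt response A3 on a test trial.

    Assumption: implicit A1 yields cues that behave like T3 elements,
    implicit A2 yields cues that behave like T4 elements, and nothing
    else influences the overt response.
    """
    p_implicit_a2 = 1.0 - p_implicit_a1
    return p_implicit_a1 * p_a3_given_t3 + p_implicit_a2 * p_a3_given_t4


# Nominal values: implicit-response probabilities set equal to the
# Discrimination a reinforcement probabilities, and perfect learning of
# Discrimination b (both are assumptions made for illustration).
group_1 = [mediated_prediction(p, 1.0, 0.0) for p in (0.90, 0.10)]
group_2 = [mediated_prediction(p, 1.0, 0.0) for p in (1.00, 0.50)]
```

Under these nominal values the sketch reproduces only the qualitative implications of the model: a higher probability of A3 on T1 than on T2 trials in both groups, and an average above .50 for Group II.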

In Discrimination c, it is assumed that 
the initial probabilities of responses A3
and A4 were equal to their probabilities
on the test trials, and that the probabili- 
ties would have gradually approached 
.50 as training progressed. No predic- 
tions about the rate of change can be 
derived from the model in its present 
form. 

For this experiment, the model implies 
that P(A3|T1) should be greater than
P(A3|T2) for both groups. This is a
result of the reinforcement probabilities
on the discrimination problems. In the
presence of T1, an implicit A1 response
should be more probable than an implicit
A2 response in both groups, because of
the Discrimination a training. Therefore,
with high probability, a subset of
the elements associated with stimulus T3
should become available for sampling on
those trials, and those elements have a
very high probability of being conditioned
to Response A3, due to the Discrimination
b training. As a consequence,
A3 should be the more frequent
response on the T1 test trial and Discrimination
c trials. On the other hand,
using the same reasoning, A3 should be
the less frequent response on the T2 test
trial and Discrimination c trials.

This prediction was confirmed on both 
the test trials and Discrimination c trials, 
with only the difference on the test trials 
for Group II failing to reach statistical 
significance. This result indicates that a 
mediation process was occurring during
the trials, since the experiment was
designed to insure against any possibility
that physical similarity could account
for the generalization of responses from
the stimuli of Discrimination b to those
on the test trials and Discrimination c.

The model implies further that there
should have been a preponderance of A3
responses on the test trials and Discrimination
c trials for Group II. This is
because the average probability of an
implicit A1 response across both types of
test trials and Discrimination c trials
should have been greater than the
probability of an implicit A2 response,
due to the asymmetrical reinforcement
probabilities in Discrimination a. Therefore,
the implicit response should have
produced a subset of the elements of
stimulus T3 more than half the time,
making the response A3 more likely than
A4. Specifically, the average of P(A3|T1)
and P(A3|T2) should be greater than .50
for Group II. The obtained averages,
.51 on the test trials and .52 on the
Discrimination c trials, are very close to
.50, and their deviations from it do not
approach statistical significance.

This failure of the model is reflected in
deviations from the quantitative predictions.
On the test trials, both proportions
for Group I, and P(A3|T2) for
Group II, were consistent with the
specific predictions, while P(A3|T1) for
Group II was substantially below the
predicted value. For derivations and
tests relating to the quantitative predictions,
see Popper (1959).


The final proportion of A1 responses on
T2 trials for Group II was significantly
below the minimum value predictable
from the Burke and Estes discrimination
model. That model implies furthermore
that the asymptotic mean proportion of
A1 responses for Group II over both T1
and T2 trials should have been .75. The
obtained proportion on the last block,
.70, was significantly below the predicted
proportion (t = 3.56, P < .001).
In an experiment performed by Estes
and Burke (1955) testing the model,
they used the same π values as those for
Group II of the present experiment.
Their observed mean proportion of A1
responses over the last block of trials was
approximately .71 (as estimated from
the published curves). Thus, although
P(A1|T1) and P(A1|T2) were considerably
different in the Estes and Burke
experiment as compared with the present
experiment, their mean value was almost
the same in the two experiments, and was
below the predicted value.
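
The predicted value of .75 can be recovered with a one-line calculation, on the assumption (made here for illustration) that the two stimuli occur equally often and that the asymptotic mean proportion of A1 responses equals the mean probability of reinforcing event E1 over the two stimuli:

```latex
\bar{P}(A_1) \;=\; \tfrac{1}{2}\,(\pi_{11} + \pi_{21})
            \;=\; \tfrac{1}{2}\,(1.00 + .50)
            \;=\; .75
```

The obtained proportion of .70 therefore falls .05 short of this value.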

The results observed on Discrimination
a trials deviated from the predictions
based on the Burke and Estes model,
then, in a specific way: the theoretically
more frequent response did not occur as
frequently as predicted. The same
description would apply to the deviation
of the observed results on the test trials
and Discrimination c from the predictions
derived from the mediation model
proposed here. While no explanation
for the discrepancy is suggested by the
results, the fact that both models err in
the same way suggests that some assumptions
common to them are inadequate
in this experimental context. Since
all of the assumptions of the Burke and
Estes discrimination learning model are
incorporated into the mediation model,
modification of the more general assumptions
of the discrimination model seems
to hold the greatest promise for achieving
a more adequate quantitative formulation
of the process of mediated generalization.

SUMMARY 


This experiment was designed to test a 
quantitative model, based on statistical learn- 
ing theory, for mediated generalization. The 
Ss were given training on two discrimination 
problems (a and b). These problems con- 
sisted of a series of trials, each trial beginning 
with the appearance of one of two stimuli, 
with Ss required to guess on each trial which 
one of two possible outcomes would follow 
the presented stimulus. Each outcome had a 
prearranged probability of following each of 
the stimuli. Discriminations a and b were
related in that the possible outcomes on
Discrimination a were the stimuli with which
trials began on Discrimination b. The Ss
were then given Discrimination c, and their
performance on it was predicted on the assumption
that it would be affected in a
specified manner by mediating responses
resulting from the training on Discriminations
a and b. Specifically, the trials of Discrimination
c began with presentation of one
of the stimuli from Discrimination a, with Ss
required to guess which of the two outcomes
used in Discrimination b would follow. The
probabilities of their initial guesses in this
case were predicted on the assumption that
they would first respond covertly on the basis
of the outcomes of Discrimination a, and that
their covert response would produce internal
stimulation similar to the corresponding stimulus
on Discrimination b. Stimuli from the
covert responses would therefore mediate
generalization of the responses learned in
Discrimination b to Discrimination c.

The results indicated a significant effect of
the pretraining on the final problem, along the
lines predicted from the model. The precise
quantitative predictions were only partially
confirmed. The discrepancies between observed
and predicted results were compared
with discrepancies of a similar nature between
observed data on discrimination problems
and predictions based on a statistical model
for discrimination learning.


SEMANTIC SATIATION AND PAIRED-ASSOCIATE
LEARNING 1


R. N. KANUNGO,2 W. E. LAMBERT, AND S. M. MAUER


McGill University 


The phenomenon of satiation has 
been described by Smith and Raygor 
(1956) as “the reduction in the 
effectiveness of a stimulus with con- 
tinued exposure.” Two different 
methods have been used to produce
the satiation effect on verbal stimuli.
One involves the overt verbal repetition
of the stimulus while the other
relies on prolonged visual exposure to
the stimulus. The verbal satiation
effect has also been observed in
various ways. For instance, Basette
and Warne (1919) reported lapses of
the meaning of words following their
verbal repetition, and, more recently,
Lambert and Jakobovits (1960) reported
measurable decrements in the
intensity of semantic ratings of continuously
repeated words. Using the
prolonged visual exposure method,
Smith and Raygor (1956) demonstrated
that a word loses its familiarity
in the sense that associational responses
to a stimulus word become
uncommon.


The present studies explored the 
role of the satiation process in paired-associate
learning. The main question
considered was whether the
reduction of the meaning of words has
a detrimental effect on subsequent
acquisition tasks involving those very
words (Exp. I). In view of the role
that meaning plays in the response
position of the paired-associate tasks
(Cieutat, Stockwell, & Noble, 1958),
it was decided to administer the
satiation treatment to response elements
of S-R pairs. A second experiment
(Exp. II) was performed to
study the role of interpolated semantic
satiation on the recall of responses of
the paired associates.


1 This research was supported in part by
the Canadian Defense Research Board,
Grant 9401-10, and in part by a subvention
to W. E. Lambert from the Carnegie Corporation
of New York.

2 Now at Ravenshaw College, India.

EXPERIMENT I
Method 


Subjects.—The Ss were 30 undergraduate
students. None had previously participated
in a similar experiment.

Material and apparatus.—Using nonsense
syllables and words as stimulus and response
members respectively, two lists of paired
associates, each containing eight pairs, were
prepared. Nonsense syllables were chosen
from Hull's list of less than 20% association
value (Hilgard, 1958), and the response
words were chosen on the basis of their high
frequency of usage (Thorndike & Lorge,
1944) and their high connotative meaning
(Jenkins, Russell, & Suci, 1958). Each list
was printed on a strip of paper in five different
random orders in a manner suited to the
standard anticipation procedure with the
memory drum. The stimulus term alone was
presented for 3 sec. and immediately following
it the stimulus-response pair was presented
for 3 sec. Then followed the next stimulus
exposed for 3 sec. and so on. The intertrial
interval was 6 sec.

Another eight words were chosen as con- 
trols on the same basis as described above for 
response words, except that each of them was 
made equal in length to a response word of the 
second list. These words were used as controls
in the sense that they were not to enter
into the learning task after they had been
given satiation treatment. Care was taken
that the control words were neither structurally
nor semantically related to the response
words of the paired associates which
were to be learned.

Three semantic differential scales (Good-Bad,
Active-Passive, Strong-Weak) representing
the three major factors of connotative
meaning (Osgood, Suci, & Tannenbaum,
1957) were used for measuring the intensity
of semantic ratings of words. Each paired-associate
response word and control word was
printed on a separate 3 X 5 in. index card.
Each semantic scale was also printed on a
separate card. All cards were placed in a
Kardex folder so that E could expose them in
a predetermined random order, one at a time,
first a word, and then a semantic scale along
which S gave his ratings of the immediately
preceding word.

Procedure.—All 30 Ss were tested individually.
Initially, S was presented the
first paired-associate list (List I) with
standard instructions for the anticipation
procedure involving the use of a memory
drum. Before the actual presentation of the
list, S was made familiar with the anticipation
procedure by a single presentation of two
practice pairs.

Three consecutive successful anticipations
were considered as the learning criterion. On
the basis of their learning scores, Groups C
(control) and E (experimental), equated for
both trials and errors, were formed for the
main stage of the experiment. There were
15 Ss in each group.

The main part occurred approximately 1
wk. after each S's initial testing. For each S
of Group E, the normal semantic profile was
obtained for each of the eight response words
of the second paired-associate list (List II).
The procedure was the same as that used by
Lambert and Jakobovits (1960). Briefly,
each word was exposed for 1 sec. and then S
was asked to indicate the appropriate
semantic placement by pointing to one of the
seven positions on the semantic scale. Then,
for the satiation treatment, each of the response
words was again exposed for 1 sec.
and S was asked to repeat the word aloud
continuously for 15 sec., at a rate of 3-4
repetitions per sec. Immediately after the
repetition, E exposed a semantic scale and S
made his rating for the word. This procedure
was repeated three times for each of the eight
words, one time for each semantic scale. The
words and scales were presented in an
order which maximized the separation of reoccurrence
of a word and scale. For Group
C, however, the eight control words were
used instead of the List II response words.
From each S of Group C, first, the normal
semantic profile was obtained for each of the
control words, and then satiation treatment
was administered to these words. Thus the
Ss of Group C were given exactly the same
type of treatment as given to Group E, except
that the eight words which were rated and 
satiated were not those to appear as response 
words in the paired-associate list. 

Immediately after the satiation treatment, 
each S of Groups E and C was presented the 
second paired-associate list on the memory 
drum with exactly the same instructions as 
given for learning List I. The same procedure 
and learning criterion as described for the 
initial stage were used again. 
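
The rating procedure above can be illustrated with a minimal sketch of a polarity score. It assumes that polarity is the absolute distance of a rating from the neutral midpoint (position 4) of a seven-position scale, summed over the three scales and averaged over words. The text states only that entries are average polarity scores per word over the sum of three scales, so the exact scoring rule, the function names, and the ratings below are hypothetical.

```python
NEUTRAL = 4  # assumed midpoint of a seven-position scale (positions 1..7)


def polarity(ratings):
    """Sum of absolute deviations from the neutral midpoint over the scales."""
    return sum(abs(r - NEUTRAL) for r in ratings)


def mean_polarity(profiles):
    """Average polarity per word; `profiles` maps word -> ratings on three scales."""
    return sum(polarity(r) for r in profiles.values()) / len(profiles)


# Hypothetical ratings (Good-Bad, Active-Passive, Strong-Weak) for two words.
before = {"table": (6, 5, 7), "river": (2, 6, 5)}
after = {"table": (5, 4, 6), "river": (3, 5, 4)}

# A positive change means the ratings moved toward the neutral midpoint.
change = mean_polarity(before) - mean_polarity(after)
print(mean_polarity(before), mean_polarity(after), change)
```

Under this scoring rule a decrease in mean polarity after repetition corresponds to the reduction in intensity of meaning that the satiation treatment is expected to produce.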


Results 


Both the trial and the error measures
for learning of List I make it
clear that Groups C and E were in fact 
equated for the main stage of the 
experiment. The mean number of 
trials to reach criterion for Group C 
was 10.20 (SD = 3.17), and for Group 
E was 10.07 (SD = 2.46). Likewise, 
the mean error scores for Groups C 
and E were 20.00 (SD = 12.58) and 
20.07 (SD = 9.86), respectively. 

An examination of Table 1 indicates 
that for Group C, the satiation treat- 
ment of the control words led to a 
significant decrement in their rated 
meaning. For Group E, however, the
meaning decrement does not quite
reach significance (.05 < P < .10).
A t test applied to the mean satiation
scores of both groups revealed no
reliable differential effect of the satiation
treatment on the two groups
(t = .55). Since Groups C and E do
not differ significantly with respect to
their satiation scores, the data from
both the groups were combined to see
if the overall effect of satiation treatment
is to reduce the meaning intensity
of the words. The combined
mean semantic rating scores presented
in Table 1 show that the meaning
decrement is significant (P < .01).

The effect of satiation of response
words on the acquisition of the second
paired-associate list is shown in Table
2. Group C, given satiation treatment
for control words immediately


before learning, was significantly su- 
TABLE 1


EFFECT OF SATIATION TREATMENT ON THE SEMANTIC PLACEMENT OF WORDS


Before Satiation | After Satiation Change 


Group 


Mean 
3.88 
4.47 
4.18 





| 


: | 
SDpirr. t 











4.20 
4.68 


1.68 0.52 


0.41 


2.26 ee 
1.94* 
1.90 


4.44 | 


| 
1.77 
» | 


a Entries are average polarity scores per word over the sum of three semantic scales.


* .05 < P < .10.
** P < .05.
*** P < .01.


perior to Group E with respect to 
acquisition of the list. In terms of
error scores the difference between the
groups is significant beyond the
.01 level, but in terms of trials to
criterion, the difference is not reliable
(.05 < P < .10).
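
The before-after comparisons summarized in Table 1 list a mean change, an SD of the differences, and a t value. A minimal sketch of how such a t could be computed follows, assuming a t test for correlated (paired) scores. The report does not name the exact test used, and the scores below are hypothetical rather than data from the study.

```python
import math
import statistics


def paired_t(before, after):
    """Return (mean difference, SD of differences, t) for paired scores."""
    diffs = [b - a for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, sd_diff, t


# Hypothetical polarity scores for four Ss, before and after satiation.
before = [5.0, 4.0, 6.0, 5.0]
after = [4.0, 4.0, 5.0, 4.0]

mean_diff, sd_diff, t = paired_t(before, after)
print(mean_diff, sd_diff, t)  # evaluated with n - 1 = 3 df
```

With 15 Ss per group the same statistic would be evaluated with 14 df, and with 29 df for the combined groups.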


Discussion 


Two general conclusions can be drawn 
from the results of the study. First, in 
support of the earlier findings of Lambert 
and Jakobovits (1960), the study shows 
that the overall effect of the satiation 
treatment of words is to reduce the 
intensity of their meaning. The reason 
for not obtaining a significant satiation 
effect in Group E, while Group C showed 
such an effect, is unclear. However, 
there is one possibility. It will be
observed that in Group E, the initial
ratings of the response words are higher


TABLE 2 


EFFECT OF SATIATION TREATMENT OF
RESPONSE WORDS ON THE LEARNING
OF PAIRED ASSOCIATES


2 94*** 


* .05 < P < .10.
** P < .01.


0.26 0.47 3$.30°°° 


than the initial ratings of the control 
words in Group C (see Table 1). Such 
higher semantic ratings imply greater 
polarization of judgments on the part 
of Ss in Group E. According to Osgood,
Suci, and Tannenbaum (1957, pp. 155 ff.)
polarization of judgments is an index of
habit strength. Thus it would be expected
that in Group E the “word-meaning”
habit is stronger than the
similar habit in Group C. Consequently,
Group E would show stronger resistance
than Group C to any semantic change as
a result of satiation treatment.

The second and most interesting finding
is that satiation treatment applied to
response words has a negative transfer
effect on the later learning of a paired-associate
list. Lambert and Jakobovits
(1960) conceptualized the phenomenon
of semantic satiation as “a cognitive
form of reactive inhibition” and related
it to Osgood’s theory of representational
mediation processes. Their explanation
could account for the superiority of
Group C over Group E in paired-associate
learning by assuming that
reduction in the meaning of response
members makes them more difficult to
associate. However, the results can also
be accounted for in terms of principles of
associative learning. When a response
member (R) is continuously repeated,
the different associations elicited by the
word (m components) may gradually
extinguish whereas the R-R connection
gets strengthened. This could be an
instance where experimentally developed
frequency of stimulation () may lead to
decrease in m. Decrease in meaning as a
function of satiation treatment, therefore,
can be interpreted in terms of
increasing S’s tendency to connect the
word with itself rather than to any of its
common associates.

Thus the effect of satiation of response 
words on subsequent acquisition can be 
interpreted in terms of transfer from one 
learning situation to another. For the 
experimental group, the meaning of the 
response words decreased possibly be- 
cause of the formation of an association 
of the response word with itself which 
would produce an impairment in the 
subsequent learning of the paired associates.
The situation is analogous to
developing R-R connections for the experimental
group where all the m components
(“hooks” or associations) of R
extinguished, and similarly X-X connections
for the control group where all
the m components of R remain unaffected
before S-R learning. Extinction
of m components of R before learning for
the experimental group would explain
the superiority of the control group. The
importance of m components in verbal
learning is well recognized (Noble, 1952).

More recently Cieutat (1960) in trying
to clarify some of the conflicting data
concerning the locus of familiarization
and its effect on paired-associate learning,
noted that, “familiarity only with
the response member inhibits learning”
(p. 274). It should be observed that his
method of familiarization involved continued
visual presentation for 60 sec.
similar to the prolonged visual exposure
method of satiation. To explain his
results he argues “that the monotony of
continued visual presentation evokes an
inhibiting influence” (p. 274).


Another possible interpretation of the
present results makes use of prelearning
response similarity. The satiation
treatment given to the response words
reduced their meaning, possibly making
them more alike semantically. If so, one
would expect to find more intralist
response competition for Group E than
for Group C. An examination of errors
revealed that 61% of all errors for
Group E are intralist intrusions in comparison
with 67% for Group C, a comparison
which rules out this interpretation.


EXPERIMENT II


We were interested in extending
this line of reasoning to another
aspect of verbal learning. The present
study compared the effects of the
satiation treatment on stimulus and
response members of paired associates
when the treatment was presented
after the associates had been learned.
In this case both stimulus and response
members were meaningful
words. Use was made of a simple
retroactive inhibition design. During
the original learning phase, the S-R
connections were established, while
during the interpolated phase either
stimulus (for one group) or response
elements (for a second group) were
given the satiation treatment, and
finally recall of response elements was
tested when stimuli were presented.


Method 


Subjects.—The Ss were 52 university
students. None had previously participated 
in an experiment of this type. 

Materials and apparatus.—Several quite 
different methodological procedures were em- 
ployed in Exp. II. Using meaningful words 
as stimulus and response members, a list of 12 
paired associates was prepared. The words 
were chosen on the basis of their high fre- 
quency of usage in print (Thorndike & Lorge, 
1944) and their high connotative meaning 
(Jenkins et al., 1958). Each of the 12 pairs 
was judged (by 12 students acting as judges) 
to have little or no immediate association be- 
tween its stimulus and response members. 

Each paired associate was printed on a 
separate 3 X 5 in. card. Further, each stim- 
ulus and response member was printed on a 
separate card. These cards were placed in a 
Kardex folder so that E could expose them in 
a predetermined random order. Each stim- 
ulus word was placed immediately before the 
paired associate to which it corresponded so 
that E could expose the stimulus-response 
pair after the exposure of the stimulus word in 
a reliably constant manner with a minimum
of delay.

Three semantic scales were used for
semantic ratings. These were Good-Bad,
Active-Passive, Strong-Weak.

Procedure.—The study used two test conditions,
a “Stimulus condition” and a “Response
condition.” Each test condition was
in the form of a retroactive inhibition paradigm
and was divided into three phases.

Learning phase.—This phase was identical
for both test conditions. Each S was given
four trials, a complete trial consisting of the
exposure, in a predetermined, random order,
of each stimulus member of the paired associates
followed by the stimulus-response pair.
Each stimulus member and each pair was
exposed for 3 sec. and a 10-sec. delay was
given between trials.

After four learning trials Ss were assigned
to either the Stimulus or Response condition
depending on their learning efficiency, equating
the two groups on paired-associate
learning ability.

Stimulus condition.—First, S’s normal
semantic profiles for all 12 stimulus words
were obtained. Each word was presented
three times (for 1 sec. each time) for measurement
on the three semantic scales. The words
and scales were also presented in a predetermined
randomized order.

Each of the 12 stimulus words was placed
in one of two categories, Satiation Category
(SC) or Nonsatiation Category (NSC). An
attempt was made to group one half of the
stimulus members of paired associates which
had been learned by the fourth learning trial
in SC, and the other half in NSC. Cases
where odd numbers of associations had been
learned were balanced through the total
group. Further, one half of the stimulus
members of paired associates which had not
been learned by the fourth learning trial were
grouped in SC, the other half in NSC.
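
The split into SC and NSC described above can be sketched as follows. This is an illustration only: the report says that "an attempt was made" to halve the learned and unlearned items and that odd counts were balanced through the total group, but it does not say how items were chosen. Random selection, the `extra_to_sc` flag, and the word lists are assumptions introduced here.

```python
import random


def split_half(items, rng, extra_to_sc):
    """Shuffle items and split them in half; an odd item goes to SC if extra_to_sc."""
    items = list(items)
    rng.shuffle(items)
    half, odd = divmod(len(items), 2)
    n_sc = half + (1 if odd and extra_to_sc else 0)
    return items[:n_sc], items[n_sc:]


def assign_categories(learned, unlearned, rng, extra_to_sc=True):
    """Return (SC, NSC) word lists, halving learned and unlearned items separately."""
    sc_l, nsc_l = split_half(learned, rng, extra_to_sc)
    # Send the odd unlearned item the other way so SC and NSC stay equal in size.
    sc_u, nsc_u = split_half(unlearned, rng, not extra_to_sc)
    return sc_l + sc_u, nsc_l + nsc_u


# Hypothetical list of 12 stimulus words: 7 learned by the fourth trial, 5 not.
learned = ["w1", "w2", "w3", "w4", "w5", "w6", "w7"]
unlearned = ["w8", "w9", "w10", "w11", "w12"]

rng = random.Random(0)
sc, nsc = assign_categories(learned, unlearned, rng)
print(sc, nsc)
```

Alternating `extra_to_sc` from one S to the next would be one way to balance the odd cases through the total group.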

Each word in SC was exposed for 1 sec. 
and Ss were asked to repeat the word aloud 
for 15 sec. at a rate of 2—3 repetitions per sec. 
Immediately after the continual repetition, 
Ss rated the word on one of the three semantic 
scales. Each word in NSC was exposed for 1
sec. and Ss rated it immediately after exposure.
After the list had been subjected to
this treatment once (each word in SC receiving
satiation treatment and being measured
on one scale, and each word in NSC
merely measured on one scale) all words were
then rated in the usual way on the remaining
scales. That is, each stimulus word was exposed
for 1 sec. and then rated immediately
on one of the two remaining scales. Note that
the satiation treatment was only given once,
before one of the semantic ratings, not before
each rating as was the case in Exp. I. Initial
and final semantic ratings were subsequently
compared.

Response condition.—The procedure for
this condition was identical to that for the
Stimulus condition except that the response
rather than the stimulus members were
grouped into SC or NSC categories and then
given the satiation treatment.

It can be seen from this procedure that
words in SC and words in NSC were exposed
an equal number of times to Ss. Furthermore,
due to the equal division of the words belonging
to correctly learned paired associates into
SC and NSC in each test condition, a basis
was established for comparing the effects of
satiation and nonsatiation treatments on the
recall of learned paired associates. Likewise,
due to the division of the study into two test
conditions, a basis was created for comparing


TABLE 3


AVERAGE CHANGE IN POLARITY OF PAIRED-ASSOCIATE MEMBERS
OVER THE SUM OF THREE SCALES: EXP. II


First Rating    Second Rating

Mean    Mean


Stimulus
Satiated
Nonsatiated

+.18
5.09

Response
Satiated 7 | 3 }
Nonsatiated 7

16

1.66


Note Twenty h of the test condition

* P < .01
the effect of satiation treatment given to
stimulus and response words on their recall.

Recall stage.—This stage of the study was
identical for both test conditions. Ss
were shown each stimulus word for 3 sec. and
asked to recall the response word paired
with it.


Results


Table 3 presents the mean change
in polarity scores for stimulus and
response words, respectively. It can
be seen that in both cases the reduction
in intensity of meaning as
measured by the semantic differential
is significant for words given satiation
treatment (P < .01 for both stimulus
words and response words). On the
other hand, words not given satiation
treatment showed no significant semantic
change.

Table 4 presents the mean number
of paired associates learned by the
fourth trial. In the Stimulus condition,
an attempt was made to administer
interpolated satiation treatment
to half of the stimulus members
of these learned paired associates
and not to the other half. A similar
attempt was also made in the Response
condition except that instead
of stimulus members, the response
members of the learned paired associates
received the interpolated treatments.
An examination of Table 4
reveals that such an attempt was
successful. The mean number of
correct responses on the recall trial
after the interpolation treatments,
also presented in Table 4, reveals how
much interference resulted from the
satiation or no satiation treatments
of the words learned by the fourth
trial. In the Stimulus condition, a
mean drop of 1.27 in recall of responses
of the learned paired associates is
noticed when the stimulus members of
those paired associates are given
satiation treatment. But the mean
drop in the response recall of the
learned paired associates of which the
stimulus members were in NSC is .58.
The difference between these means is
highly significant (P < .001).

It can be seen that the mean drop
in recall scores for learned paired
associates of which the response
members were given satiation treatment
was .69 and the mean drop in
recall for learned paired associates
whose response members were in NSC
is .81. The difference between these
two means, of course, was not significant.


TABLE 4


EFFECT OF SATIATION TREATMENT OF PAIRED-ASSOCIATE MEMBERS
ON THE RECALL OF PAIRED ASSOCIATES


Stimulus
SC Words
NSC Words

Response
SC Words
NSC Words

Drop in Recall

Some of the paired associates which 
were originally unavailable to Ss after 


four learning trials were available at
recall. It is difficult to speculate as to
whether these paired associates were 
at an ‘“‘oscillation period’’ of avail- 
ability (Osgood, 1953, pp. 503-504) or 
whether they were learned during the 
fourth trial when the correct response 
to the stimulus was exposed, or 
whether they were somehow made 
available during the _ interpolated 
period. Whatever the source of 
learning, its pattern is consistent with 
the other results. Of the 30 paired 
associates unavailable after four trials 
in the Stimulus condition which were
subsequently available at recall (a 
total of 30 paired associates for the 
group) 19 were ones whose stimulus 
members were in NSC while only 
11 were ones whose stimulus members 
were in SC. Further, of the 25 paired 
associates unavailable after four trials 
in the Response condition which were 
available at recall, 16 were ones whose 
responses were in the NSC while 9 
were ones whose responses were in the 


SC. These observations clearly follow 
the trends established by the results 
presented in Table 4. 


Discussion 


The findings of Exp. II demonstrate 
that paired-associate connections can be 
retroactively disrupted if the connotative 
meanings of their stimulus members are 
satiated. However, associational bonds
are not affected by satiating response
members of already learned paired
associates. These results could be explained
in terms of the associational
interpretation of semantic satiation presented
earlier in connection with Exp. I.
Here it is argued that continual repetition
of a word (TABLE, TABLE, TABLE, etc.)
would strengthen the tendency for the
word TABLE to be made as a response to
the stimulus word TABLE. Thus, if the
interpolated satiation treatment involves
formation of a positive reaction tendency
or a word-word habit, then in the present
experiment, the stimulus satiation condition
can be considered analogous to the
response variation retroaction paradigm,
and retroactive interference would be
expected (Osgood, 1953, pp. 325 ff.).
On the other hand, the response satiation
condition is analogous to the stimulus
variation retroaction paradigm where
retroactive facilitation is expected. The
reason why retroactive facilitation could
not be obtained in the response satiation
condition must depend upon factors
other than the formation of the word-word
habit per se during the interpolated
period. In view of the importance of
meaning in the response positions of the
paired-associate tasks, it seems logical
to presume that the reduction in meaning
of the response items during the interpolated
period might have counteracted
the facilitating effect of the word-word
habit. This explanation, however, leads
to the theoretical expectation that the
retroactive facilitation effect can be obtained
after interpolated satiation treatment
to the response items if one uses
nonsense verbal units as responses. The
findings of Exp. II when considered with
the findings of Exp. I, make it clear that
the effects of reduction of meaning of
response items on the formation (as in
Exp. I) or on the maintenance (as in
Exp. II) of associational bonds are always
detrimental.


SUMMARY 


The role of verbal satiation in paired- 
associate learning was investigated. Two 
groups of 15 Ss each were matched on the 
basis of their learning measures in an initial 
test using a paired-associate list. In the main 
test both the groups learned a second paired- 
associate list. But immediately before learn- 
ing, Group E (experimental) was given 
satiation treatment of the response members 
while Group C (control) was given similar 
treatment to words which were not response 
members. Results indicated that (a) the 
satiation treatment of words caused a decrease 
in their connotative meaning as measured on 
semantic scales and (b) Group E was slower in 
learning than Group C. 

In Exp. II, using a retroactive inhibition
paradigm, the effect of satiation treatment
of stimulus words on the recall of already
learned paired associates was studied. Satiation
treatment resulted in significantly
more retroactive interference than did the
nonsatiation control treatment. The interpolated
satiation of response words produced no
significant effect on later recall. Satiation
treatment given to both the stimulus and the
response words resulted in a significant
reduction in the intensity of their meanings
as measured by semantic differential scales.
The results were discussed in terms of
an associational interpretation of semantic
satiation.
EFFECTS OF VISUAL AND VERBAL CUES
ON LEARNING A MOTOR SKILL¹


LAWRENCE KARLIN AND RUDOLF G. MORTIMER²


New York University 


In the training of motor skills 
additional cues may be supplied that 
will not be present in the operational 
situation. Improvement during train- 
ing produced by such cues has fre- 
quently been found not to persist in 
subsequent tests in which these cues 
were not present. On the basis of such 
results, Miller (1953) has distin- 
guished between cues that tell S what 
to do next, which he labels “‘action 
feedback,”’ and cues that tell S what 
he should have done, which he labels 
“learning feedback.”” The same cue 
may function in varying degree both 
as action and as learning feedback. 
According to Miller, cues which 
function primarily as action feedback 
do not produce improvement in performance
in tests from which they
have been removed and they may
even produce a decrement relative to
control conditions.

A study by Lincoln (1954) on the
effects of different cues on learning to
turn a crank at a specified rate is
relevant to the above distinction.
One group was given verbal information
on the amount and direction of
the average rate error after each
training trial. A second group was
given this information plus a continuous
visual cue during each training
trial which indicated instantaneous
rate error. Both groups yielded
similar learning curves but in criterion
(retention) tests in which only the
intrinsic kinesthetic cues remained,
the verbal group did significantly
better than the verbal-visual group.


¹ This research was part of a program
carried out under contract with the United
States Naval Training Device Center, Port
Washington, New York and described in
Technical Report: NAVTRADEVCEN
$58-2.

² Now at Purdue University.

These results may mean that the 
visual cue functioning as action feed- 
back was a useful guide to perform- 
ance but did not promote learning to 
use the intrinsic kinesthetic cues. The 
verbal cue when used alone may have 
functioned as learning feedback but 
in combination, the visual cue functioning
as action feedback was so
much more available that it minimized
the use of the verbal cue.

Continuing along these lines a study
by Karlin (1960) investigated the
effects of visual, auditory, kinesthetic,
and verbal error cues, both singly and
in a number of combinations, on performance
of a task similar to that used
by Lincoln (1954). It was found that
a combined visual and verbal cue was
consistently but not significantly superior
to a verbal cue in learning, and
equally good in retention.

These results did not agree with
those obtained by Lincoln, and suggested
that certain differences between
the experimental conditions might be
important. Thus while the verbal cue
used in both studies was the same, the
visual cue was continuous in Lincoln’s
study and discrete in Karlin’s study.
It is possible that the failure to find
similar results in retention was due to
the fact that the discrete visual cue
was less informative than Lincoln’s
continuous visual cue. One of the
objectives of the present investigation
(Karlin & Mortimer, 1961) was to
check this possibility by using a continuous
visual cue both alone and in
combination with a verbal cue.

In order to gain further knowledge 
concerning the mode of action and 
differential effectiveness of the visual 
and verbal cues, a scoring system was 
used by which performance could be 
evaluated in terms of both constant 
and variable errors. It was felt that 
this technique would prove valuable 
in determining the underlying effects 
of feedback. Lincoln also had scored 
performance for both constant and 
variable errors but he did not obtain 
significant results for the variable 
errors. Variable errors were measured 
by the number of times S passed in 
and out of the tolerance range. This 
method of measuring variable error is 
contaminated with constant error 
factors which Lincoln may have disregarded
because he felt that the
variable error would be practically
important when measured by its effect
on a total accuracy score only when
the constant error was relatively
small. In the present study the apparatus
was specifically designed to
yield constant and variable error
scores which were independent of each
other.


METHOD 


Subjects.—The Ss were 45 paid, volunteer, 
right-handed male college students. 

Apparatus.—Except for the use of a continuous
visual display the apparatus was
basically the same as that used by Karlin
(1960), where a more detailed description
may be found. Essentially, the apparatus
consisted of a crank handle 1 in. in diameter
and 5 in. long, masked from S’s view, which
turned on a mainshaft at a diameter of 7 in.
Connected to the mainshaft was a Weston
tachometer generator whose output was fed
into the electronic scoring system and into
the display meter.

The scoring system consisted of 15 channels
in which high speed counters cumulated the
time that S was turning at a rate within the
range that defined each channel.

The display consisted of a Triplett voltmeter
carrying a translucent 4 X 2 in. scale,
illuminated from the rear, and graduated into
50 units with a center zero marking. The
meter was mounted in a vertical panel 22 in.
in front of S. The meter responded to the
output of the tachometer generator (which
had a linear response function) such that at
99 rpm the meter needle would be at the
center of the scale as indicated by the zero
mark.

Procedure.—The Ss were seated in front 
of the crank assembly and grasped the crank 
with the right hand. The task was to learn 
to turn the crank at 99 rpm. 

The Ss were randomly assigned to the 
visual, verbal-visual, and verbal cue condi- 
tions, 15 Ss per condition. Those receiving 
the visual cue were instructed in the use of the 
display meter. Those receiving the verbal 
cue were informed, at the end of a trial, of the 
amount and direction of the mean rate error 
in rpm. The verbal-visual group was given 
both types of cue. During retention trials the 
feedback cues were removed. 

The Ss were tested on 2 consecutive days.
On Day 1 Ss received 3 practice trials without
feedback, 25 learning trials with feedback, 15
immediate retention trials, and 10 relearning
trials with feedback. On Day 2 they received
15 delayed retention trials and 15 relearning
trials with feedback. The first session lasted 
approximately 50 min. and the second session, 
which took place about 24 hr. later, lasted 
30 min. 


A buzzer was used to indicate the beginning 
of a trial and 3 sec. after S began to turn the 
crank scoring was begun. At the end of a 
further 15 sec. a Hunter timer broke the 
scoring circuit and the buzzer was sounded to 
inform S of the end of a trial. The intertrial 
interval was 30 sec. The interval between 
blocks of trials was about 2 min. All Ss wore 
headphones to muffle outside noise. Masking 
noise was provided by a fan. 


RESULTS 


Three measures of performance
were obtained for each S on each trial
as follows: (a) Total time (sec.) that
S turned at a rate within the tolerance
range of ±13.5 rpm. (b) Constant
error (rpm); i.e., the arithmetic mean
of the rate errors, which gave the average
amount by which S was
turning too slow or too fast on each
Fig. 1. Mean time within tolerance range by trial and cue (N = 15 each cue condition).


TABLE 1


DUNCAN RANGE TESTS OF MEAN DIFFERENCES BETWEEN CUES WITHIN BLOCKS
OF TRIALS FOR DIFFERENT SCORING TECHNIQUES


Comparison | Learning (D, P) | Immediate Retention (D, P) | Relearning I (D, P) | Delayed Retention (D, P) | Relearning II (D, P)


Time Within Scoring Tolerance in Seconds 


Verb: Verb-Vis |—2. 01 | 1.87| ms |—2.83| .01 3. -2.51| .01 
Verb: Vis : j 95 ns |—1.97 01 5.77 | —1.94; 01 
Verb-Vis: Vis s j ns 86! ns F 57] ms 


CE in rpm 
Verb: Verb-Vis F 5. 01 2.07 
Verb: Vis , 05 |—5.€ 1 | 1.07 
Verb-Vis: Vis ‘ 05 .07 ns |—1.00 


in rpm 
Verb: Verb-Vis | 3. . 4 4.59 


Verb: Vis ‘ j A 3.26 | 
Verb-Vis: Vis —1. ; -2. -1.33 
trial and was the figure used for the
verbal cue given at the end of each
trial. (c) The SD of the rate (rpm);
i.e., the deviation of each of the midpoints
of the 15 rate intervals from S’s
mean turning rate during that trial
was weighted by the time recorded
for that interval, and the SD was then
computed as the root mean square
of these time-weighted deviations.
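
The constant-error and SD measures described above can be sketched from channel times as follows. This is a hedged illustration: it assumes the mean turning rate is itself the time-weighted mean of the channel midpoints, and the midpoints and times shown are hypothetical. The report gives 15 channels but not their boundaries, so three channels are used here for brevity, and `rate_stats` is a name introduced for this sketch.

```python
import math

TARGET_RPM = 99.0  # rate S was asked to maintain


def rate_stats(midpoints, seconds, target=TARGET_RPM):
    """Return (constant error, SD) in rpm from time accumulated per rate channel."""
    total = sum(seconds)
    # Time-weighted mean turning rate over the channels.
    mean_rate = sum(m * t for m, t in zip(midpoints, seconds)) / total
    constant_error = mean_rate - target  # negative = too slow, positive = too fast
    # Root mean square of the time-weighted deviations from S's own mean rate.
    variance = sum(t * (m - mean_rate) ** 2 for m, t in zip(midpoints, seconds)) / total
    return constant_error, math.sqrt(variance)


# Hypothetical 15-sec. trial spread over three channels.
midpoints = [96.0, 99.0, 102.0]
seconds = [3.0, 9.0, 3.0]

ce, sd = rate_stats(midpoints, seconds)
print(ce, sd)
```

Because the deviations are taken from S's own mean rate rather than from the target, the SD is unaffected by a uniform shift in rate, which is what makes the variable-error score independent of the constant error.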
Figure 1 shows the results obtained
for the three groups when total time
within the tolerance range was averaged
over all Ss within a group for
each trial. In the learning and relearning
trials the scores of the verbal
group are consistently poorest. On
the other hand, the verbal group did
best in immediate and delayed retention.
A simple variance analysis of
the differences between conditions
based on the last five trials in each
block using a within-groups error
term with 42 df, yielded Fs significant
at the .01 level for all blocks except
immediate retention, which yielded
insignificant results (F = 1.99,
P > .05). More detailed evaluation
of these data using Duncan range
tests (Edwards, 1960) is given in the
first section of Table 1. This table
shows that while the verbal-visual
group was consistently superior to the
visual group in all blocks of trials,
none of these differences was significant.
It is worth noting, however,
that the largest difference was obtained
for delayed retention in which
the trends show a tendency to diverge.
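
The simple variance analysis mentioned above can be sketched as a one-way analysis with a within-groups error term. The scores below are hypothetical and the function name is introduced for this illustration. With three cue conditions of 15 Ss each, the within-groups df would be 45 - 3 = 42, as reported. It is assumed that each S contributes one score per block, such as a mean over the last five trials.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical time-on-target scores (sec.) for three small groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]

f_ratio, df_b, df_w = one_way_anova(groups)
print(f_ratio, df_b, df_w)
```

A significant overall F would then be followed by pairwise comparisons between cue conditions, which is the role the Duncan range tests play in Table 1.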

When total performance was analyzed
into constant and variable error
components and averaged over all Ss
within each condition, the corresponding
trends shown in Fig. 2 and 3,
respectively, were obtained. The
constant error trends show consider- 
able similarity to the trends of Fig. 1. 
In addition, the Duncan range tests 
shown in the second section of Table 1 
yielded significant differences in im- 
mediate retention. The variance 
analyses for constant error are not 
shown, but they all yield F ratios 
which are significant at better than 
the .01 level. On the other hand, the 
trends shown in Fig. 3 for variable 
error are strikingly different from 
those of Fig. 1 and 2. Now the verbal- 
visual and visual groups consistently 
yield lower variable errors than the
verbal group. While the differences
between the visual and verbal groups
are not as large, they all favor the
visual group and, with the exception
of immediate retention, the Duncan
range tests are all significant, as shown
in the third section of Table 1. With
the exception of delayed retention,
which was nearly significant at the .05
level (F = 3.15), a variance analysis
of each block yielded F ratios significant
at the .01 level or better.

Mean rate variability (SD) by trial and cue (N = 15 each cue condition).
Curves: visual, verbal, and verbal-visual groups.


DISCUSSION 


When given in terms of time within
scoring tolerance, the learning score
differences disagree with those obtained
by Lincoln (1954) and agree with those
obtained by Karlin (1960). Possibly the
disagreement is due to differences in the
characteristics of the display although
the present display differed appreciably
from those used in both of the above
studies. In immediate retention these
results agree with Lincoln’s findings that
the verbal cue was superior to the
verbal-visual cue, although in the present
study this difference was not significant.

When performance is further analyzed
into constant and variable error
components, differences among the
conditions are more pronounced.
Considering the constant error first, the
verbal cue is significantly superior to the
verbal-visual cue in retention and
significantly inferior to this cue in
learning. A similar picture is obtained
when the verbal cue is compared to the
visual cue although the differences in the
learning and relearning trials are not so
great. Significant differences are also
obtained (with one exception) when
performance is analyzed in terms of
variable error but this time the verbal
cue is inferior to both verbal-visual and
visual cues in immediate and delayed
retention and in learning.

From this analysis it is clear that when
performance is measured by total score,
the superiority of the verbal cue in
retention is a result of its effect on
constant rather than on variable errors.
This conclusion is reasonable since the
verbal cue provided information directly
determined by the constant error for a
given trial. On the other hand, the
visual cue provided an immediate index
of performance which did not distinguish
between constant and variable errors
since it did not average over time.
However, the results suggest something
that was not apparent in those of earlier
experiments, namely, that with “action”
cues (see Miller, 1953) like the visual
or verbal-visual cues something is learned
that reduces rate variability which
persists even after the cues are
withdrawn. Apparently, the visual cue
leads to a relatively stable improvement
in smoothness and steadiness of
performance. On this point note that the
verbal-visual cue is superior both in
learning and retention to the visual cue.
These results suggest that the two cues
may interact to produce greater
steadiness during retention than the
visual cue alone but since the results are
not statistically significant further work
on this question is needed.

It is worth noting that the verbal cue 
in Lincoln’s as well as in the present 
study did not give variable error in- 
formation and one may speculate on how 
a verbal cue which gave an average of the 
variable errors at the end of a trial would 
affect performance. 

On the whole the results of the present
study show that cues which might
ordinarily be considered to function as
action feedback, or as a “crutch” to
guide performance, can make a
contribution to retention of a motor skill
by way of reducing variable error,
although this contribution can be
obscured if only the total score is
considered. On the other hand the
results support Lincoln’s finding that the
visual cue can be a “hindrance” when
combined with the verbal cue as far as
the constant error component of the
total score is concerned.

Finally, it is important to note that
the present results are based on a type
of task which involves producing a single
steady state. In this sense they may
have a bearing on other types of task
which involve a single production such
as the line-drawing tasks of Thorndike
(1932). Further evidence for this
conclusion is to be found in a recent
study by Baker and Lavery (1960) who
used a series of tasks requiring a single
end product.


SUMMARY 


The effects of visual, verbal, and combined
verbal-visual cues on the learning and
retention of a crank-turning task were
investigated. The task was to turn the crank
at 99 rpm. The Ss were 45 right-handed males,
15 Ss in each condition.

It was found that: (a) Overall superiority
in retention tests of task performance
measured by time within tolerance was due
mainly to reduction of constant errors.
(b) The verbal cue was inferior in learning
but [superior] in retention tests when
performance was measured by time within
tolerance [or] magnitude of constant error.
(c) The verbal cue was inferior to both the
visual and combined verbal-visual cues during
learning and retention trials when variable
error was measured.


PREDICTION OF SOME STOCHASTIC EVENTS:
A REGRET EQUALIZATION MODEL ¹


MAX S. SCHOEFFLER 


Bell Telephone Laboratories, Incorporated, Murray Hill, New Jersey 


The present paper describes a series 
of experiments which were performed 
to develop some data on human pre- 
diction capabilities, a model for how 
such prediction capabilities are ex- 
pressed as behavior, and a series of 
tests of this model. 

The data were collected in a rather 
simple experimental situation. People
were asked to make “bids” on 100
successive numbers that appeared.
Either they were instructed that these
numbers referred to “make-believe
dollars” (MBDs) which they might
win, or they were told only to guess as 
close as possible to the next number. 
For groups that bid for MBDs, the 
number of MBDs S received depended 
on the relationship between his bid 
and the number that actually occurred 
(and thus on the adequacy with which 
he was able to predict the upcoming 
number). 

All of the experiments to be reported
here involved this general situation.
Two of these experiments
provided the intuitive basis for con- 
structing a model of behavior under 
these circumstances. The remaining 
experiments were then used to evalu- 
ate the adequacy of the model when 
the assumptions made in constructing 
the model were specifically tested for 
generality under conditions to which 
the model seemed applicable. 


One variable that was investigated
concerned the specific relationship
between the bid that was made and the
number that came up. It was termed
the payoff variable and was varied for
different groups in order to provide one
test of the adequacy of the model.

¹ The research reported in this paper was
done at the Willow Run Laboratories,
University of Michigan, under a contract
with the Department of the Army.

The condition henceforth labeled
Guess is the one characterized above as
not receiving MBDs. For the Nonpunish
condition, the payoff to S was

$$P = \begin{cases} B & \text{if } B \le N \\ 0 & \text{if } B > N \end{cases}$$

where B is the bid made by S, N is the
number that appeared (input), and P
is the payoff to S in MBDs. That is, S
won as many MBDs as he had bid so
long as his bid was less than or equal to
the input number. If his bid exceeded
the input number, he won nothing.
Similarly, for the Punish condition,

$$P = \begin{cases} B & \text{if } B \le N \\ -B & \text{if } B > N \end{cases}$$

Thus the Punish condition differed from
the Nonpunish condition only in that a
person lost the amount that he bid in
case of an overbid, rather than simply
receiving nothing.

Make-believe dollars were used because
some pilot work indicated that they
would function adequately as incentives.
In an attempt to retain a linear value
scale for the MBDs, only a relatively
small range of values was used. To the
extent that the value remained linear
with number of MBDs, the paradigm
permitted a reasonable specification (in
this arbitrary unit of measurement) of
the payoff matrix relating each response
($B_i$, $i = 1, \ldots, m$) to each
experimental outcome ($N_j$,
$j = 1, \ldots, k$) in terms of the payoff,
$P_{ij}$ (positive or negative), to S. For
the present study involving predicted
numbers as responses, $B_i = i$ and
$N_j = j$.

TABLE 1
INPUT DISTRIBUTIONS FOR THE VARIOUS EXPERIMENTAL GROUPS
(Table body not recoverable from the scan; surviving row labels: Guess,
Guess, Guess, Nonpunish, Nonpunish, Nonpunish, Punish, Punish.)

The model that was devised to describe
behavior in this situation treats
“regret” as a central concept. Regret is
defined as the difference between the
payoff on a trial and the maximum
possible. That is, if $P_{ij}$ represents the
payoff to S given response $B_i$ and event
$N_j$, then the regret experienced by S is
$R_{ij} = |P_{ij} - \max_i P_{ij}|$.


Although for the Guess condition $P_{ij}$
is not explicitly defined, a concept of
regret still appears intuitively
meaningful. This regret is considered to
be the difference between the predicted
number and the actual number.
Formally, $R_{ij} = |B_i - N_j|$.

Under this definition, the regret
outcomes of a trial must take on
nonnegative values. However, since the
experiment under consideration deals
with ordered outcomes and the responses
can likewise be ordered, it is reasonable
to consider separately a regret due to
overbidding and one due to underbidding.
That is to say, since the values of $B_i$
and $N_j$ have been, respectively,
identified with the numbers represented
by i and j, then $R_{ij}$ is due to
overbidding if i > j, is due to
underbidding if j > i, and is zero if
i = j.
The regret equalization hypothesis to be
proposed as a model involves the notion
that regret due to overbidding and regret
due to underbidding result, respectively,
in tendencies to lower or raise the
response. That is, if, for example,
$R_{ij}(n)$ is the regret on Trial n and
i > j, then the response on Trial n + 1
tends to be less than i; if i < j, then the
response on Trial n + 1 tends to be
greater than i. If another assumption is
made, viz., that the amount of this effect
is linear with $R_{ij}$, it seems reasonable
to expect behavior to stabilize (if indeed
it does stabilize) at a point where the
expected value of the regret due to
overbidding is equal to the expected
value of the regret due to underbidding.

More formally stated, the regret
equalization hypothesis predicts an
asymptotic bid level b such that b
satisfies

$$\sum_{j=0}^{[b-1]} p_j R_{bj} = \sum_{j \ge [b]} p_j R_{bj}$$

where [b] is the smallest integer ≥ b and
$p_j$ is the probability that $N_j$ occurs.


It is clear that b is not necessarily an
integer, whereas the $N_j$ are integers.
Thus, b cannot represent a constant
terminal response level for an individual.
Rather it must be an average taken at
least over several responses of an
individual. Further, the same asymptote
is predicted for all individuals facing the
same sequence of events. No doubt
individual differences do exist, but for
the sake of a model with maximal
simplicity and intuitive appeal, it was
deemed proper to avoid introducing a
parameter to deal with individual
differences. Rather an attempt will be
made to evaluate the predictions using no
fitted parameters. (This is perhaps an
overstatement since the hypothesis was
constructed using the data from two of
the groups. Thus, it may be argued that
the transformation from MBDs to regret
in itself constitutes fitting a parameter.)


METHOD 


All experiments used groups of between 13
and 24 college students which constituted
classes in freshman, sophomore, or junior
psychology courses. The experiments were all
conducted in a similar fashion and required
that S predict on 100 successive trials the
number E wrote on the blackboard on those
trials. The instructions differentiated the
three payoff functions used.

For the Punish and Nonpunish groups, Ss
were instructed that they would get 100
successive opportunities to request between
[...] and 30 make-believe dollars (MBDs). On
each trial they were to write down the amount
of money they were requesting on that trial.
After they wrote this number, E wrote a
number on the blackboard. For the Punish
groups the instructions stated that if the
amount requested was less than or equal to the
amount that E wrote on the board S would
win the amount requested. If, however,
S requested more than E subsequently wrote,
S would lose the amount that he bid. For
the Nonpunish groups the penalty for
overbidding was that S simply did not win
anything, and otherwise the rules were as for
the Punish condition. The Guess groups were
simply told to guess as close as possible to the
number E would write on the blackboard, with
[...] equally [...].

These differences in instructions provided
to the different groups constituted the payoff
variable.

For each bid, each S entered on his data
sheet the amount he bid, whether he won or
lost on that trial, and a running total of his
winnings. (However, Ss in the Guess groups
only wrote down the amount bid.) Bids were
requested and amounts were written on the
board by E at the rate of about one bid
every 15 sec. In all, the experiment involved
100 such trials.

The numbers that E wrote on the blackboard
are called “input.” The numbers were
randomly selected from particular
distributions. The distributions from which
they were drawn differed for the different
blocks of trials. For all experiments the
distributions were rectangular, the integers
included all appearing with equal probability.
In Table 1 the distributions are indicated by
the highest and lowest integer used. For
example, 8-15 represents the integers 8, 9, 10,
11, 12, 13, 14, and 15.


RESULTS AND DISCUSSION 


Payoff variable: Guess.—Curves de- 
scribing performance under the Guess 
condition are presented in Fig. 1 and 2. 
Figure 1 presents the means and
inter-S SDs for Groups G1a, G1b, and
G3. As indicated in Table 1, the
Groups G1a and G1b differed
procedurally from Group G3 only in
having their input distributions
displaced by five units. Thus, these data
serve to indicate the relationship of
the asymptotic response mean and
variability to the input mean. Visual
inspection of these data permits one to
conclude that the mean asymptotic
response level is equal to the mean of
the input distribution and that the
inter-S asymptotic variability is
independent of the input mean.

FIG. 1. Means and inter-S SDs for the
Guess condition with a constant input
distribution, averaged over five-trial blocks.
(Also shown are the averages of the input
distributions. These coincide with the
predicted asymptotic response levels.)

FIG. 2. Means and inter-S SDs for the
Guess condition with a shifting input
distribution, averaged over five-trial blocks.
(Also shown are the averages of the input
distributions. These coincide with the
predicted asymptotic response levels.)

Figure 2 presents comparable curves 
for Groups G2a and G2b. The input 
numbers for these groups on Trials 
1-30 and 61-90 were drawn from the 
integers 8-15 (the distribution used 
for Groups G1a and G1b). On Trials
31-60 the input numbers were drawn
from the integers 13-20 (the dis- 
tribution used for Group G3). The 
input numbers of Trials 91-100 were 
always 12, but the corresponding data 
will not be discussed for any of the 
groups. 


The data in Fig. 2 can thus be com- 
pared to the data from Fig. 1. They 
indicate that the groups exposed to 
shifts in the input distribution ap- 
proach asymptotes comparable to 
those attained by groups main- 
tained on the same input distribution 
throughout. The inter-S variability
is a negatively accelerated decreasing 
function of trials, except that if a shift 
is introduced, the variability ap- 
parently increases temporarily. 

Under this (Guess) condition, the
regret equalization hypothesis requires
that the mean asymptotic response, b,
satisfy

$$\sum_{j=0}^{[b-1]} (b - N_j)\,p_j = \sum_{j \ge [b]} (N_j - b)\,p_j$$

where [b] is the smallest integer ≥ b,
$N_j$ is an input number and $p_j$ is the
probability that $N_j$ is chosen on a
given trial. Solving this equation yields

$$b = \sum_j N_j p_j = \bar{N},$$

the input mean.
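
The step from the equalization condition to the input mean is short
enough to write out. A sketch, assuming only that the probabilities
$p_j$ sum to one and that the two sums together cover every input number:

$$
\begin{aligned}
\sum_{j=0}^{[b-1]} (b - N_j)\,p_j - \sum_{j \ge [b]} (N_j - b)\,p_j &= 0 \\
\sum_{j} (b - N_j)\,p_j &= 0 \\
b \sum_j p_j &= \sum_j N_j p_j \\
b &= \sum_j N_j p_j = \bar{N}
\end{aligned}
$$

For the rectangular input 8-15, for example, this gives
b = (8 + 15)/2 = 11.5.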

The mean response curves thus are 
in line with the hypothesis. However, 
the symmetry of the situation is such 
that almost any conceptual model 
would make a similar prediction. 
Consequently, the predictions of the
model are next examined under
conditions similar to those used here,
but with an asymmetric value structure
superimposed.

Payoff variable: Nonpunish.—Fig- 
ures 3 and 4 describe the performances 
of the groups that were given the 
Nonpunish instructions. The data of 
Fig. 3 are analogous to those of Fig. 1 
and the data of Fig. 4 to those of 
Fig. 2. Groups N1a and N1b differed
from Group N3 in that N3 received
an input distribution that was
displaced by five units. Groups N2a and
N2b received the same input
distribution as Groups N1a and N1b on
Trials 1-30 and 61-90, and the same 
distribution as Group N3 on Trials 
31-60. 

The mean response data are again 
characterized by apparently very 
stable asymptotes when the same in- 
put distribution is used throughout 
(Fig. 3). Again the differences
between N1 and N3 in the mean
asymptotic response demonstrate the 
dependence of the response mean on 
the mean of the input distribution; 
while the similarity in the inter-S 
variability between these groups in- 
dicates that this measure is independ- 
ent of the input mean. However, the 
response asymptotes are no longer at 
the mean of the respective inputs, but 
rather are at a new value appreciably 
below the mean of the input. 

In the case of the shifting distribu- 
tion (Fig. 4), the response asymptotes 
are again predictable from the data 
of the groups receiving the same input 
distribution throughout. Again there
is apparently a slight increment in
response variability associated with a
shift in the input distribution and a
negatively accelerated monotonic
decrease in variability when the input
distribution is not shifted.

FIG. 3. Means and inter-S SDs for the
Nonpunish condition with a constant input
distribution averaged over five-trial blocks.
(Also shown are the averages of the input
distributions and the predicted asymptote for
each group.)

FIG. 4. Means and inter-S SDs for the
Nonpunish condition with a shifting input
distribution averaged over five-trial blocks.
(Also shown are the averages of the input
distributions and the predicted asymptote for
each group. The input distribution is the
same as that of Fig. 2 for the Guess condition.)

According to the regret equalization 
hypothesis, the asymptotic response 
mean should again be predictable. If 
the response is less than the corre- 
sponding input number, there is 
assumed to be a regret equal to the 
difference between the two. In case 
of an overbid, since S receives nothing, 
his regret is equal to the input. Thus,
his asymptotic bid level b should
satisfy

$$\sum_{j=0}^{[b-1]} N_j\,p_j = \sum_{j \ge [b]} (N_j - b)\,p_j$$


where the right side of the equation 
constitutes the expected regret due to 
underbidding and the left side con- 
stitutes the expected regret due to 
overbidding. 

FIG. 5. Means and inter-S SDs for the
Punish condition with a shifting input
distribution averaged over five-trial blocks.
(Also shown are the averages of the input
distributions and the predicted asymptote
for each group. The input distribution is the
same as that of Fig. 2 for the Guess condition
and that of Fig. 4 for the Nonpunish
condition.)

Under the Nonpunish condition for
the input 8-15, the equation is
satisfied by b = 9.7 and for the input
13-20, by b = 14.1. These predictions
are indicated on Fig. 3 and 4 and
are approximately at the observed 
asymptotes (within one standard er- 
ror). In contrast, it may be noted 
that asymptotes of 8 and 13, respect- 
ively, would be required if Ss were 
either to maximize expected payoff or 
to minimize expected regret. 
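
The comparison with payoff maximization can be checked numerically.
The sketch below is illustrative only: the function names (`payoff`,
`expected_payoff`, `best_integer_bid`) and the restriction of the search
to integer bids are assumptions made here, not part of the original
study. The payoff rules and the rectangular input distributions follow
the text.

```python
def payoff(bid, n, condition):
    """Payoff in MBDs for one trial, given the bid and the input number n."""
    if bid <= n:
        return bid              # bid at or below the input: S wins the bid
    if condition == "nonpunish":
        return 0                # overbid: S wins nothing
    if condition == "punish":
        return -bid             # overbid: S loses the amount bid
    raise ValueError(f"unknown condition: {condition}")


def expected_payoff(bid, inputs, condition):
    """Mean payoff when every number in `inputs` is equally likely."""
    return sum(payoff(bid, n, condition) for n in inputs) / len(inputs)


def best_integer_bid(inputs, condition):
    """Integer bid (assumed search space: 0..max input) with the highest expected payoff."""
    candidates = range(0, max(inputs) + 1)
    return max(candidates, key=lambda b: expected_payoff(b, inputs, condition))


if __name__ == "__main__":
    for low, high in [(8, 15), (13, 20)]:
        inputs = list(range(low, high + 1))
        print(low, high, best_integer_bid(inputs, "nonpunish"))
```

Under these assumptions the search returns the lowest input number for
both distributions, in line with the asymptotes of 8 and 13 quoted above
for payoff maximization.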

Payoff variable: Punish.—The data 
for the Punish groups are shown in 
Fig. 5 and 6. The results are con- 
sistent with those of the other condi- 
tions. That is, the mean response 
levels depend on the mean of the input 
distribution, and the inter-S SDs are 
negatively accelerated monotonic de- 
creasing except for slight increases 
when the input distribution is shifted. 

The underbidding regret is here 
identical to that of the Nonpunish 
cases. However, in case of an overbid, 
S loses the amount bid, so that the
regret is defined to be the sum of the
input and the response. The asymptotic
bid level b should, therefore, satisfy

$$\sum_{j=0}^{[b-1]} (N_j + b)\,p_j = \sum_{j \ge [b]} (N_j - b)\,p_j$$

Values of b that satisfy this equation 
are 8.9 for the input distribution 8-15, 
13.2 for the input 13-20, and 18.1 for 
the input 18-25. The predicted 
asymptotes are indicated in Fig. 5 
and 6 and appear to be reasonably 
descriptive of the data. Comparable 
asymptotic predictions would be 8, 13, 
and 18, respectively, if Ss maximized 
expected payoff or minimized expected 
regret. 
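
The equalization condition can also be solved numerically. The
following is a minimal sketch, not the author's procedure: the function
names and the use of bisection are assumptions introduced here, while
the regret definitions for the three conditions are taken from the text.
Because the overbidding side of the equation jumps each time b passes an
input number, the code looks for the bid level at which expected
overbidding regret first matches or exceeds expected underbidding regret.

```python
def expected_regrets(b, inputs, condition):
    """Expected regret from overbidding and from underbidding at bid level b.

    `inputs` lists the equally likely input numbers (rectangular distribution).
    An underbid costs N - b in every condition; an overbid costs
    b - N (guess), N (nonpunish), or N + b (punish), as defined in the text.
    """
    p = 1.0 / len(inputs)
    over = under = 0.0
    for n in inputs:
        if n < b:  # bid exceeds the input: overbid
            if condition == "guess":
                over += (b - n) * p
            elif condition == "nonpunish":
                over += n * p
            elif condition == "punish":
                over += (n + b) * p
            else:
                raise ValueError(f"unknown condition: {condition}")
        else:      # bid at or below the input: underbid (zero regret if equal)
            under += (n - b) * p
    return over, under


def equalization_point(inputs, condition, tol=1e-6):
    """Bisect for the bid level where overbid regret first reaches underbid regret."""
    lo, hi = float(min(inputs)), float(max(inputs))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        over, under = expected_regrets(mid, inputs, condition)
        if over < under:
            lo = mid   # still regretting underbids more: move up
        else:
            hi = mid   # overbid regret has caught up: move down
    return (lo + hi) / 2.0


if __name__ == "__main__":
    cases = [("nonpunish", 8, 15), ("nonpunish", 13, 20),
             ("punish", 8, 15), ("punish", 13, 20), ("punish", 18, 25)]
    for condition, low, high in cases:
        b = equalization_point(list(range(low, high + 1)), condition)
        print(f"{condition} {low}-{high}: b = {b:.2f}")
```

Under these assumptions the crossing points come out at about 9.67 and
14.0 for the Nonpunish inputs 8-15 and 13-20, and about 9.0, 13.25, and
18.0 for the Punish inputs 8-15, 13-20, and 18-25. Each lies within
roughly 0.1 of the value quoted in the text. Where the crossing falls on
an integer, the two expected regrets jump past one another rather than
meeting exactly, which may account for the small offsets.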

It was noted above that the regret 
equalization hypothesis evolved from 
a consideration of the effect of over- 
or underbidding on a subsequent bid. 
The conceptualized process produces 
a decrease in the response level in 
proportion to the amount of regret 
experienced as a result of overbidding 
and an increase in the response level 
in proportion to the amount of regret 
due to underbidding. A detailed look
at the trial by trial changes in the bid
level supports this conceptualization.
Let regret due to underbidding be
arbitrarily defined to be positive and
regret due to overbidding be defined
to be negative. One can use the
product-moment correlation, r, between
amount of regret and amount of change
in the bid level as an index of the extent
to which this model of the effect of
regret actually describes the data. Such
correlations have been computed for
each S in each experiment for the three
successive blocks of 30 trials. The
averages of these r’s taken over the
group are shown in Table 2.

FIG. 6. Means and inter-S SDs for the
Punish condition with a shifting input
distribution having a larger maximum than used
elsewhere.

TABLE 2
CORRELATION BETWEEN RESPONSE CHANGE AND REGRET COMPUTED OVER
30-TRIAL BLOCKS
(Table body largely unrecoverable from the scan; surviving entries under
Trial Block 31-60: .672, .700, .641.)

For all groups, the r is reasonably
large on the initial trial block. However,
those groups that received the same
input distribution throughout the three
trial blocks (G1, G3, N1, and N3) seem
to show a sharp decrease in the degree
of correlation between regret and
response change over the three trial
blocks. Thus, there is evidence that
regret is indeed intimately related to
response change, but that with a
constant population distribution for the
input, this relation becomes less
important. In addition, although no
interpretation will be attempted here, it
should be noted that all correlations for
Guess groups are larger than any for the
other groups.
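
The index described above can be sketched in a few lines. This is an
illustration under stated assumptions, not the original computation: the
function names are hypothetical, and each regret is paired with the
change from that trial's bid to the next trial's bid. The sign
convention (underbidding positive, overbidding negative) and the regret
magnitudes follow the text.

```python
from math import sqrt


def signed_regret(bid, n, condition):
    """Regret on one trial: positive for an underbid, negative for an overbid."""
    if bid <= n:
        return n - bid              # underbid (zero if the bid was exact)
    if condition == "guess":
        return -(bid - n)
    if condition == "nonpunish":
        return -n
    if condition == "punish":
        return -(n + bid)
    raise ValueError(f"unknown condition: {condition}")


def pearson_r(xs, ys):
    """Product-moment correlation; assumes both lists have nonzero variance."""
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def regret_change_correlation(bids, inputs, condition):
    """Correlate regret on trial n with the bid change from trial n to n + 1."""
    regrets = [signed_regret(b, n, condition)
               for b, n in zip(bids[:-1], inputs[:-1])]
    changes = [later - earlier for earlier, later in zip(bids, bids[1:])]
    return pearson_r(regrets, changes)
```

A subject whose bid always moved by a fixed positive fraction of the
signed regret would give r = 1 on this index; lower values indicate that
the bid changes are less tightly tied to the preceding regret.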

An attempt was also made to
provide some additional insight into
the values of the correlation
coefficients. It was noted that the SDs
of the bids in a group also decreased
to a low value over a series of trials,
showing increases only when a change
occurred in the input distribution.
If the decrease in these SDs also
implied a decrease in the SD of a
series of responses of an individual
then that might be sufficient to
account for the decrease in correlation
of the regret with the change in bid
level. Accordingly, such intra-S SDs
were computed for each S for each
block of 30 trials. Averages of these
intra-S SDs are shown in Table 3.
Also given in this table is the SD of
the input distribution. This value of
2.29 is appreciably larger than the
asymptotic SDs for at least some of
the groups (notably N1a, N1b, and
N3, the Nonpunish groups with
constant input distributions).

TABLE 3
INTRA-S SDs COMPUTED OVER 30-TRIAL BLOCKS
(Table body not recoverable from the scan.)

These data show that the Guess
groups also tend to exhibit relatively
larger intra-S SDs, and that appreciable
decreases in the intra-S SDs
occur over the successive trial blocks.
This is in line with the decreasing r’s
of Table 2, since it is clear that the r
must perforce be low if the variability
in bid level is low. However, the
patterns of the decreases of the two
sets of data are different. Most of the
decrease in correlation occurs only for
the groups not getting shifts in the
input distribution and it occurs on the
last trial block. In contrast, all of the
groups show a decrease in intra-S
variability, and the decrease takes
place primarily on the second trial
block.


No explanation is attempted here for
these effects. This does not detract from
the importance of these effects for
learning theory. The data demand that
whatever theory or model is used to
account for them must also produce
these decrements in variability and
correlation. In particular, an adequate
theory must make the response
variability (both intra- and inter-S)
dependent on the stationarity of the input
distribution.


SUMMARY 


Subjects were instructed to ask for some
number of “make-believe dollars” (MBDs)
or simply to guess a number which E would
subsequently present. The payoff to S
depended on the relation of S’s bid to E’s
number. Three conditions were used to
determine the payoff. In two of these, Ss
were encouraged to bid high, but excessively
high bids were punished. In the other
condition, over- and underbids were treated
symmetrically. A model was constructed which
predicts the asymptotic bid level under these
conditions to be at a point where the expected
regret due to overbidding is equal to the
expected regret due to underbidding.

The results indicated: (a) The asymptotes
of the bids depend on the payoff conditions
and the distribution of input numbers as
predicted by the model. (b) Both the inter-
and the intra-S variability decrease over
trials except when the distribution of input
numbers is changed. (c) The increase or
decrease in bid level on a trial is highly
correlated with the regret associated with the
preceding trial.


(Received December 11, 1961)


SOME EFFECTS OF THE PERCENTAGE OF RELEVANT
CUES AND PRESENTATION METHODS
ON CONCEPT IDENTIFICATION ¹


MARGARET JEAN PETERSON 


Indiana University 


In order to perform the complex 
discriminations necessary for success- 
ful solution of a concept formation 
problem, Ss must distinguish between 
dimensions which are relevant to the 
solution of the problem and those 
which are not. Predicting that
adaptation of the irrelevant cues and
conditioning of the relevant cues
would be facilitated by temporally
proximate presentation of instances
of the relevant dimension, the studies
reported herein varied the proximity
of relevant instances by presenting all
instances relevant to one concept
before showing instances representing
another concept (homogeneous
condition), and by presenting the instances
representative of three separate
concepts in a mixed sequence
(heterogeneous condition).


The percentage of relevant cues per
problem was manipulated by using
three relevant dimensions and one
irrelevant dimension for one problem
(75%R); two relevant and two
irrelevant for another (50%R); and
one relevant and three irrelevant for
a third (25%R).

Underwood (1952) emphasized that
temporally contiguous presentation of
stimuli which are instances related to
the same concept should facilitate
learning by minimizing the interference
effects that might be produced by
interpolated instances of other
concepts. Although massed practice has
not always been associated with
increased efficiency in solving problems
(Underwood, 1961), recent
demonstration by Cahill and Hovland
(1960) of the importance of memory
in the acquisition of concepts
suggested the prediction that
homogeneous presentation would favor
faster learning than would
heterogeneous presentation of the relevant
instances. Further, an interaction
between the two variables was
predicted; namely, that the advantages
of homogeneous presentation would
be greater for the 25%R problems
than for the 75%R problems.

¹ ... Grant ... from the ... assistance of ...


EXPERIMENT | 
Method 


Stimulus three-valued di- 


(small, 


one, two, three r 


material SIX 


mensions were used: size medium, 


large); number of figures 
form of the figures (circle, triangle, square 

number of lines on the edges of the cards (one, 
two, three); color (red, blue, green and 


position of the figures on the cards (right, 
middle, left) 
sions were randomly drawn for the problem 
subject to the restriction that each dimension 
be represented equally often as relevant or 


From this population, dimen- 


irrelevant over the entire set of problems 
he stimuli were 
on white 4 X 6 in 
the irrelevant 


painted with poster paint 
All possible com 


and the relevant 


cards 
binations of 
dimensions appeared in a given problem deck. 
relevant dimen- 


sions for the 50°, R and 75°; 


Instances representing the 
R problems 
were paired using a table of random numbers 
so that, for example, if color and number of 
figures were relevant, the single figures were 
always painted blue; two figures were always 
red; and three figures, green. 
used were held constant on all cards within a 


Dimensions not 
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TABLE 1 


TRIALS TO CRITERION AND CORRECTLY IDENTIFIED DIMENSIONS AS A FUNCTION 
OF CONDITIONS AND PERCENTAGES OF RELEVANT DIMENSIONS: Exp. I ANp II 


Trials 


Experiment and 
Condition 


Mean | Mdn.| SD | Mean} Mdn 


Experiment I 
Heterogeneous 
Homogeneous 


23.67 
4.69 


20.16 
» 9? 


14.00 7.00 


13.94 
3 4.00 


56 
Experiment II 
Proximate (P) 
Spaced, Filled (Sp) 
Spaced, Unfilled 
(Sp) 


4.88 
7.83 
3.83 


3 4.25 
4.44 
7 3.00 


4.00 
6.00 
2.00 


4.4 
4.1 
2.6 


problem deck, e.g., when color was not used 
as either a relevant or an irrelevant dimension, 
all stimuli were painied gray. . The 3 X 2 
factorial design consisted of the three per- 
centages of relevant dimensions and the two 
methods of presentation. Each problem was 
presented under both methods of presenta- 
tion. Each S solved all six problems whose 
sequential order of appearance had been 
determined by a Latin square design such 
that each problem appeared equally often in 
every ordinal position. 
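
The counterbalancing described above can be illustrated with a cyclic Latin square. This is only a sketch under stated assumptions: the problem labels and the cyclic construction are hypothetical, since the text does not report the actual square used. It shows the one property the text does state, that each of the six problems appears once in every ordinal position.

```python
def cyclic_latin_square(items):
    """Return a cyclic Latin square: row i is `items` rotated left by i."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Hypothetical labels (not from the source): three percentages of relevant
# dimensions crossed with heterogeneous (Het) and homogeneous (Hom) presentation.
problems = [
    "25%R-Het", "25%R-Hom",
    "50%R-Het", "50%R-Hom",
    "75%R-Het", "75%R-Hom",
]

square = cyclic_latin_square(problems)

# Each row is one order of presentation; each column is an ordinal position.
for order in square:
    print(order)
```

With 36 Ss, one plausible use (again an assumption) would be to assign an equal number of Ss to each of the six row orders.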

Apparatus.—Cards were viewed through a
one-way vision mirror mounted in a large
black screen. The S indicated his response
by pressing one of three telegraph keys. A
reinforcing light was placed immediately
above each key.

Subjects and procedure.—The Ss were 36
students from introductory psychology
courses at Indiana University. Experimental
participation was a course requirement. After
instructions defining S’s task, including an
enumeration of the possible bases for
discrimination, a set of practice cards which had
the letters A, B, or C on them was shown to S
until he had responded correctly three times.
Homogeneous presentation consisted of showing
cards representative of one concept until
S had correctly identified the cards three
consecutive times. Then representations of
the second concept were shown to the same
criterion, followed by the presentation of the
third concept. Concept refers to the relevant
stimulus characteristics of cards associated
with one of the response keys so that Ss were
said to learn three concepts to define a
problem. If color were the relevant dimension,
learning Response A to red represented
one concept; Response B to blue was the
second; and Response C to green, the third.
In heterogeneous presentation instances of the
three concepts were assigned randomly with
the restriction that one instance of the three
concepts appear in each block of three cards.
The Ss were run to a criterion of 9 consecutive
correct responses or until 54 trials had been
completed. Each S was then queried about
the basis for solution of the problem before
going on to the next.
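
The heterogeneous ordering rule and the stopping rule just described can be sketched as follows. The function names, the concept labels A, B, C, and the convention that the trials-to-criterion score counts the criterion run itself are assumptions made for illustration; the text does not say how the score was tallied.

```python
import random


def heterogeneous_sequence(n_trials, concepts=("A", "B", "C"), seed=None):
    """Random order with one instance of each concept in every block of three."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_trials:
        block = list(concepts)
        rng.shuffle(block)          # randomize order within the block
        sequence.extend(block)
    return sequence[:n_trials]


def trials_to_criterion(correct, run_length=9, max_trials=54):
    """Trial on which a run of `run_length` consecutive correct responses is
    completed, or `max_trials` if the criterion is never reached."""
    run = 0
    for trial, is_correct in enumerate(correct[:max_trials], start=1):
        run = run + 1 if is_correct else 0
        if run == run_length:
            return trial
    return max_trials


# Hypothetical response record: two errors among the first three trials,
# followed by nine consecutive correct responses.
responses = [False, True, False] + [True] * 9
print(heterogeneous_sequence(9, seed=1))
print(trials_to_criterion(responses))
```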


Results and Discussion 


Generally fewer responses were
required to reach the criterion following
homogeneous presentation than
following heterogeneous presentation.
Higher percentage relevant problems
were learned more rapidly than low
percentage ones (Table 1).

A Friedman two-way analysis of
variance (Siegel, 1956, p. 166), employed
because of the heterogeneity of
variance, yielded a significant χr² of
18.35 (df = 2, P < .001). The chi
square for the three levels of percentage
of relevant dimensions with heterogeneous
presentation was also significant
(χr² = 27.56, df = 2, P < .001).
The z conversions for the sign test
(Siegel, 1956, p. 72) used to assess the
differences between presentation conditions
within the 25%R, 50%R, and
75%R conditions were 3.04 for 25%R
(P = .0012), 2.70 for 50%R
(P = .0035), and 1.33 for 75%R
(P = .0934). The predicted interaction
was found: homogeneous presentation
had a greater facilitating
effect for the low percentage of relevant
dimensions than for higher
ones.
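
The step from a sign-test z to a one-tailed probability can be sketched with the standard normal distribution. The normal approximation with a continuity correction shown in `sign_test_z` is the usual textbook form; the counts passed to it below are hypothetical, because the text reports only the resulting z values.

```python
import math


def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def sign_test_z(n_plus, n_pairs):
    """Normal approximation to the sign test with a continuity correction.

    n_plus  -- number of pairs showing a difference in one direction
    n_pairs -- number of pairs with a nonzero difference
    """
    x = max(n_plus, n_pairs - n_plus)   # the more frequent sign
    return ((x - 0.5) - n_pairs / 2) / (0.5 * math.sqrt(n_pairs))


# z values quoted in the text for the 25%R and 50%R comparisons.
print(round(one_tailed_p(3.04), 4))
print(round(one_tailed_p(2.70), 4))

# Hypothetical counts, for illustration only: 27 of 36 Ss favoring one condition.
print(round(sign_test_z(27, 36), 2))
```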


Hull (1920) in his classical study of
concept identification using Chinese
characters noted that Ss were not
necessarily able to define verbally
the property common to a specific
concept even though they could assign
stimuli to concepts correctly. In
contrast, Bourne and Haygood (1959)
reported that their Ss were almost
always able to label the correct dimensions,
even when more than one
dimension was relevant for a particular
problem. Table 1 contains an
analysis of verbal identification in the
present experiment. The mean number
of correctly identified dimensions
increased significantly from the heterogeneous
to the homogeneous presentation
(z transformation of sign test
= 2.94, P = .0016) and increased
as the percentage of relevant dimensions
increased (Friedman χr² = 14.59,
df = 2, P < .001). The proportions
of the number of correct identifications
relative to the total number of
correct dimensions that could have
been named demonstrated clearly that
the majority of Ss were reporting only
one dimension, even when additional
dimensions could have been used:
2 of the 36 Ss identified both correct
dimensions for the 50%R problems;
16 of the 36 Ss identified two of the
three correct dimensions for the
75%R problems; but no S identified
all three. The numbers of correct
assignments of concept instances to
the response keys were not reported,
since the ρ’s with the number of
correct identifications of the dimensions
ranged from .82 for the 25%R
problems to .96 for problems of
higher percentage.


EXPERIMENT II


The superiority of homogeneous
presentation in Exp. I may have
resulted from the closer proximity of 
instances of a given concept in that 
condition. Another possibility is that 
the absence of interference from 
presentation of instances of other 
concepts permitted faster learning. 
In Exp. II the problems were pre- 
sented using a homogeneous sequence 
while preserving the exact temporal 
ordering of the instances in the related 
heterogeneous condition of Exp. I. 
The intervals were filled with a digit 
cancellation task for one group and 
left unfilled for another. The control 
Ss learned the problems using the 
homogeneous condition of Exp. I. 


Method


Both of the two experimental conditions,
temporally proximate (P) and spaced (S),
used the homogeneous sequence of presentation
of Exp. I. Variations in temporal
separation of the concept instances
distinguished the two conditions. Condition P
was identical with the homogeneous condition
of Exp. I. In Cond. S, instances of the first
concept were shown using the temporal
intervals which existed in the heterogeneous
problem of the same percentage of relevant
dimensions, but the intervals were either
filled with digit cancellation (Sp) or left
unfilled (Sp). During the unfilled intervals
Ss sat silently in front of the darkened
aperture. Then instances of the second
concept arranged to simulate its heterogeneous
problem presentation sequence were
shown followed by the simulation of the
third. For all groups, instances of one concept
were presented until S had emitted three
consecutive correct responses before instances
of the second were presented.
The Ss, 36 students from introductory
psychology courses at Indiana University who
had not participated in similar experiments,
were assigned randomly with the restriction
that an equal number of Ss experience each
condition. The remainder of the experimental
procedure was identical to that of Exp. I.


Results and Discussion


Application of a Kruskal-Wallis
one-way analysis of variance of trials
to criterion (Siegel, 1956, p. 184)
yielded an H of 6.93 comparing the
three conditions of presentation
(Table 1) for the 25%R problems
which, with 2 df, was significant
between the .02 and .05 levels; the
comparable Hs for the 50%R problems
(4.66) and the 75%R problems
(2.66) were associated with probabilities
greater than .05, both with
2 df. Differences between the temporally
proximate presentation of the
25%R problem and either of the
spaced methods of presentation were
not statistically reliable; however, the
use of digit cancellation did significantly
increase the number of trials
required for solution relative to the
spaced condition with an unfilled
interval (P = .018 using the median
test, Siegel, 1956, p. 111). Apparently,
lengthening the interval
between instances of a concept did not
in itself significantly slow learning.
Rather, interference from instances
displaying another concept or the
introduction of another task such as
digit cancellation appeared to retard
learning, particularly with problems
characterized by a low percentage of
relevant cues.
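
The Kruskal-Wallis statistic and the probabilities quoted for it can be sketched as follows. This is a minimal illustration, not the authors' computation: the scores passed to `kruskal_wallis_h` are hypothetical, ties receive mid-ranks without the tie correction, and the probability function relies on the fact that a chi square with exactly 2 df has upper-tail probability exp(-x/2).

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H using mid-ranks for ties (no tie correction applied)."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    # Mid-rank of each distinct value: mean of the 1-based positions it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    total = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def chi_square_p_2df(statistic):
    """Upper-tail probability of a chi-square variate with exactly 2 df."""
    return math.exp(-statistic / 2)


# Hypothetical trials-to-criterion scores for three conditions.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))

# H values reported in the text, each referred to chi square with 2 df.
for h in (6.93, 4.66, 2.66):
    print(h, round(chi_square_p_2df(h), 3))
```

Under this approximation the H of 6.93 falls between the .02 and .05 levels, and the other two values exceed .05, in line with the probabilities stated above.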

Results of Ss’ identification of the
correct dimensions reflected the trends
shown in Exp. I. The higher the
percentage of relevant cues the more
frequently Ss were able to label at
least one correct dimension, although
the different manipulations of the
homogeneous condition in Exp. II
were not portrayed in these data.
Again, few Ss identified more than one
correct dimension. No Ss reported
two for the 50%R problems, nine
reports of two correct dimensions were
given for the 75%R problems, and two
reports of three dimensions were
made. Differences in the mean number
of trials to criterion and in the
mean number of correctly identified
dimensions between comparable conditions
of Exp. I and II were not
statistically significant, both chi
squares being less than 1.


DISCUSSION 


Because lengthening the interval be- 
tween presentations of instances of the 
same concept did not have a significant 
effect upon the learning of concepts, the 
efficacy of homogeneous presentation did 
not appear to reflect massing of practice, 
per se. Introduction of conditions which 
would be expected to increase the likeli- 
hood of some kind of interference such 
as the heterogeneous method of presenta- 
tion or the digit cancellation task was 
associated with slower learning of the 
concepts, particularly when the concepts 
to be learned contained a low percentage 
of relevant dimensions. It is possible
that when a high percentage of the
dimensions were relevant the problems
were learned so rapidly that these factors
became relatively unimportant or exerted
an influence too transitory to be reflected
in the response measures employed.
Furthermore, an unpublished replication
of Exp. I, using different dimensions to
constitute the problems, yielded almost
identical results.

Another observation was the infrequent
identification of more than one
correct dimension even when additional
dimensions were available and each S
had been told what dimensions might be
used. The assumption might be made
that as relevant dimensions were added,
the stimulus pool from which S sampled
would have increased so that Ss were
actively selecting from a larger population
of relevant cues for the problems of
higher percentage of relevant cues than
for problems of lower percentage. More
in accord with the data would be the
interpretation that over a group of Ss the
probabilities increased that each S
identify at least one correct dimension
without necessarily having been able to
report the presence of other relevant
dimensions.


SUMMARY 


Two experiments examined the effects of
the variation of the percentage of relevant
dimensions and the method of presentation
of concept instances on rate of concept
identification. Problems consisting of 25%,
50%, and 75% relevant cues were combined
factorially with four different dimensions.
Instances of one concept were presented until
the criterion of learning had been achieved,
then instances of the second concept were
presented followed by the third for the
homogeneous condition. In the heterogeneous
condition, instances of the three concepts
were presented in a random sequence. The
predictions that the number of responses prior
to criterion would be inversely related both
to the percentage of relevant cues and to the
temporal proximity of the instances associated
with a given response were supported.
Homogeneous presentation was more
advantageous with 25% R than with 50% R and
75% R. Experiment II demonstrated that
the lesser efficiency of heterogeneous
presentation was not a function of the greater
temporal intervals occurring between instances
of the same concept, but rather of
interference effects from other concepts, at least
with 25% R problems.

Analyses of correctly identified dimensions
suggested an interaction effect between the
percentage of relevant cues and the method
of presentation. Few Ss reported the presence
of more than one relevant dimension for the
problems with two or three completely
redundant relevant dimensions.


EASE OF CONCEPT ATTAINMENT AS A FUNCTION
OF ASSOCIATIVE RANK ¹


SARNOFF A. MEDNICK ² AND SHARON HALPERN


University of Michigan 


Underwood (1952) has suggested a 
method for the study of concept 
formation which assumes that the 
attainment of a concept calls for the 
perception of a relationship between 
concept instances. The perception of 
this relationship, in part, depends on 
the probability of the occurrence of 
the relevant associative response to 
the concept instances. This prob- 
ability is termed response dominance. 
The mean response dominance of all
instances representing the concept is
termed dominance level. Underwood
and Richardson (1956b) have shown
that the ease of attainment of a
concept is directly related to its
dominance level.

This study explores a methodological
variable which determines ease
of concept attainment. The variable
under investigation is the rank position
of the concept response in the
associative hierarchy of the concept
instance. To the concept instance
BELLY the sensory associate ROUND is
of Rank 1 with a dominance level of
43% (Underwood & Richardson,
1956a). The sensory associate SOFT
is of Rank 2 with a dominance level
of 24%. To the concept instance,
PAIL the sensory associate METALLIC
is of Rank 1 and has dominance level
of 24%. While METALLIC to PAIL and
SOFT to BELLY are equal in response
dominance they vary in their positions
in their respective associative
hierarchies: METALLIC is of Rank 1;
while SOFT is of Rank 2.
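
The distinction between response dominance and associative rank can be made concrete with a small sketch. Only the three percentages quoted above are used; the dictionary layout and the function names are assumptions for illustration, and the other associates of these nouns in the norms are not listed in the text.

```python
# Response dominance (percent of the norm group) for the examples in the text.
# Associates not mentioned in the text are omitted, so ranks computed here are
# ranks among the listed associates only.
norms = {
    "BELLY": {"ROUND": 43, "SOFT": 24},
    "PAIL": {"METALLIC": 24},
}


def associative_rank(noun, response, table=norms):
    """1-based rank of `response` among the listed associates of `noun`."""
    ordered = sorted(table[noun], key=table[noun].get, reverse=True)
    return ordered.index(response) + 1


def dominance_level(instances, response, table=norms):
    """Mean response dominance of `response` over a set of concept instances."""
    return sum(table[noun][response] for noun in instances) / len(instances)


# Equal response dominance, different rank positions.
print(norms["BELLY"]["SOFT"], associative_rank("BELLY", "SOFT"))
print(norms["PAIL"]["METALLIC"], associative_rank("PAIL", "METALLIC"))
```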

This experiment compares ease of 
attainment of concepts as a function 
of the rank position of the concepts in 
the associative hierarchy of the con- 
cept instances. For reasons developed 
below it is predicted that first ranking 
concepts will be attained in fewer 
trials and with fewer errors than will 
second ranking concepts. 


METHOD 


Lists.—The words used were concrete
nouns selected from a list of 213 nouns for
which Underwood and Richardson (1956a)
have ascertained the dominance level of
various responses. As can be seen in Table 1,
four groups of instances were assembled:
associative Rank 1 (AR 1) WHITE, AR 1
ROUND, associative Rank 2 (AR 2) WHITE,
and AR 2 ROUND. This was necessary
because of the possibility that the concepts
might differ in difficulty or that concept
difficulty might interact with associative
rank. List 1 consisted of AR 1 ROUND and
AR 2 WHITE while List 2 contained the other
two concepts. A buffer concept “LONG”
included in both lists (EEL, BEAK, ALLEY,
CUCUMBER) was used to make it more difficult
for Ss to attain the concepts by elimination
procedures. As is shown in Table 1, the
mean dominance levels of concepts were kept
nearly constant. In constructing the AR 2
concepts care was taken to avoid having the
concept instances elicit a common first
ranking response.

The instances were presented to S in three
random orders. The same three orders were
used for both lists with positions occupied by


TABLE 1


CONCEPT LISTS WITH ASSOCIATIVE RANK, RESPONSE
DOMINANCE, AND MEAN DOMINANCE LEVEL INDICATED


List I

Concept | Noun
Round | POT, EYE, DIME, GRAPE
White | FROST, GARDENIA, LARD, BONE


AR 1 instances in List 1 being occupied by 
AR 2 instances in List 2. The positions of the 
buffer terms were not changed between lists, 
although they were randomized in the three 
orders that were used. 

Subjects—The Ss were 30 undergraduate 
paid volunteers. The 15 men and 15 women 
were divided as equally as possible between 
the two lists. 

Procedure.—The lists were presented at a
4-sec. rate on a Gerbrand’s type memory
drum. A 12-sec. interval occurred between
presentations of the lists.

The Ss were informed that the list contained
12 words that could be placed in three
groups of 4 words each, and that all of the 4
words in each of these groups could be
described by the same adjective. The Ss
were required to respond to each word. The
task was continued to a criterion of one
perfect trial, or terminated at 20 trials. A
more complete discussion of the materials
and procedure may be found elsewhere
(Freedman & Mednick, 1958).


RESULTS AND DISCUSSION 


The data which were subjected to
analysis were the number of trials to
one perfect trial on a concept (this
meant giving the correct concept
response to all four instances of a
concept on the same trial) and the
number of errors made on each concept
in the entire course of the experiment.
As noted above, the buffer
concept LONG was omitted from this

List II

Concept and Mean Dominance | Noun | Response Dominance
White, 31% | HOSPITAL, ENAMEL, GOAT, BREAD | 32%, 28%, 29%
Round, 24% | BADGE, PILL, WAIST, CAPSULE | 21%, 28%, 24%


analysis. AR 1 concepts (List 1,
WHITE, List 2, ROUND) were compared
with AR 2 concepts (List 1, ROUND,
List 2, WHITE). Two List 2 Ss failed
to solve any concept and were dropped
from further analysis.

The AR 1 concepts were attained
earlier. The mean number of trials
taken to solve each concept was 5.90
for the AR 1 concepts and 8.09 for the
AR 2 concepts, a significant difference
(t = 2.96, df = 27, P < .01). The
mean number of errors on the AR 1
and AR 2 concepts were 12.17 and
18.14, respectively. This difference
was significant (t = 2.09, df = 27,
P < .05).


The results are intuitively satisfying
but detailed analysis of their interpretation
is somewhat intricate. We have
found that AR 1 concepts are attained
more easily than AR 2 concepts despite
the fact that their dominance levels are
equal. Above, we have referred to these
ranks as indicating a position in an
associative hierarchy. However, this
hierarchy is, in a sense, a figment. The
norms which provide us with dominance
levels and ranks are based on Ss giving
single sensory associates to each of 213
nouns. Actually, while WHITE is a second
ranking response to GARDENIA and a
first ranking response to ENAMEL,
28% of the norm group gave WHITE as
the first and only response to these
nouns, and in both cases, 72% gave some
other response. Thus, when we refer
to these concept responses as occupying
positions in an S’s associative hierarchy
we are making the implicit assumption
that the associative hierarchy produced
by collating the group’s single responses
is reflected to a large extent, in each
individual. In other words, we are
assuming that everyone has just about
the same basic associative hierarchy; the
fact that we get variation in single
response norms we would then attribute
to momentary fluctuations in associative
strength. Thus, if Underwood and
Richardson (1956a), had asked their Ss
to give more than one response to each
noun, a large proportion of the 72% that
did not give WHITE as their first response
to ENAMEL would have given it as their
second or third response. If this
situation were applied to the present
experiment then the superiority of the
AR 1 concepts would be understandable.
The AR 2 concept responses occupy an
inferior position (relative to the AR 1
concept responses) in almost everyone’s
associative hierarchy. This means the
AR 1 concept responses would be elicited
earlier in the course of the experiment.
This experiment may then be seen as
supporting this assumption of homogeneity
of hierarchies. Research on
word associations (Cofer, 1958; Rosen &
Russell, 1957) has contributed considerable
support to this same assumption in
another context.


SUMMARY 


Thirty Ss were presented with lists of 12 
nouns and instructed to discover into what 
three groups the nouns could be divided and 
what adjective could describe each group. 
The lists consisted of concepts of equal levels 
of dominance; the position of the concept 
responses in the associative hierarchy was 
manipulated. The concepts having higher 
rank position in the associative hierarchy 
were attained more quickly and with fewer 
errors.


CONCEPT IDENTIFICATION UNDER MISINFORMATIVE
AND SUBSEQUENT INFORMATIVE
FEEDBACK CONDITIONS ¹


WALTER J. JOHANNSEN

Veterans Administration Center, Wood, Wisconsin


Recent research on human concept
identification has aimed at delineating
the effects of feedback class on attainment
rate. In particular a recent
study by Pishkin (1960), using
misinformative feedback (MF), reveals
striking decrement when even small
percentages of erroneous, task relevant
feedback information are inserted
into schedules of informative or
correct feedback (IF). The present
study seeks to extend Pishkin’s results
by assessing the effect of MF on
subsequent concept identification under
conditions of 100% IF.

Appropriate design makes possible
the concurrent examination of another,
associated question. Pishkin
found probability matching behavior
in his concept identification study, as
did Goodnow and Postman (1955) in
a study using MF by implication. On
the other hand, Morin (1955), who
made use of MF in a simpler learning
situation, was unable to demonstrate
an adequate match in his data. He
suggested that the failure of the obtained
curves to approach an asymptote
was a factor in his results. The
present experiment is designed to
circumvent this problem by carrying
performance under MF/IF to a point
where asymptote is more nearly
approximated.

¹ The statistical analysis of this paper was
carried out in part under contract with the
Wisconsin Alumni Research Foundation.
The author wishes to express his appreciation
to the University of Wisconsin Numerical
Analysis Laboratory and to E. James Archer
for assistance with computations of the trend
test; and to Conrad Nuthmann, Samuel H.
Friedman, H. Allen and Richard M.
Lundy for their critical comments.

A final interest of this paper is the 
description of acquisition curves under 
MF/IF. Pishkin’s analysis, related 
to the Restle (1955) discrimination 
learning model, is unconcerned with 
the nature of the attainment process. 
Yet Morin’s trend analysis of his 
data suggests a more complex process 
than that typical of the probability 
matching studies. It is of interest to 
determine whether these findings can 
be replicated in data derived from a 
more difficult task. 


METHOD 


Experimental conditions.—The Ss were randomly
assigned to one of four MF/IF conditions
and one of three task complexity conditions.
The percentages of MF/IF employed
were 0:100, 12.5:87.5, 25:75, and
37.5:62.5. Task complexity was simultaneously
manipulated by varying the
number of dimensions irrelevant to problem
solution while holding constant the number of
relevant dimensions. A single dimension was
relevant for all conditions, and either 1, 3, or 6
dimensions were irrelevant. The design
therefore describes a 3 X 4 orthogonal plot.
Ten Ss were tested in each of the 12 resulting
cells.

Subjects.—The Ss were 124 sophomore
psychology students attending the University
of Wisconsin. All had volunteered for the
experiment in order to gain credit applicable
to their class grades. Four Ss were eliminated
for failure to comply with instructions.

Apparatus and procedure.—Stimuli consisted
of geometric figures drawn on 3 X 5 in.
cards. Each card was inscribed with a single
figure. Figures varied according to the
following dimensions and values within
dimensions: form (rectangle-triangle), size
(large-small), location (center-right of center),
position (vertical-horizontal), figure color
(black-blue), ground color (red-white), dot
within figure (presence-absence).

TABLE 1


DUNCAN RANGE ANALYSIS: MEAN
PERFORMANCE SCORES DURING
MF/IF TRIALS


Condition (MF%: DI) | Mean | SD


198.8 | .83
194.4 | 5.02
186.6 | 9.29
173.8 | 22.26
142.5 | 33.89
142.0 | 28.99
141.4 | 32.79
119.6 | 15.20
105. | 10.39
102.7 | 10.60
100. | 5.26
96. | 13.41


Note.—Means joined by vertical line do not differ
significantly; means not so joined are significantly
different (P < .05).

The apparatus consisted of a 20 X 36 in. 
flat-black panel mounted vertically on a table 
of normal height, which served to separate S 
from E. A 3 X 5 in. aperture was cut into 
the center of the panel slightly below eye 
level. Stimuli were manually inserted into 
this aperture from the rear. 

Two 7.5-w. bulbs, one red and one white, 
were mounted side by side in sockets set 6 in. 
apart, 8 in. above and to either side of the 
presentation aperture. These lights served 
as feedback signals and were controlled by 
E, using two Western Union telegraph keys. 

Instructions were read to S, informing him 
that he was to take part in a concept identi- 
fication experiment, and that his task would 
involve the classification of cards placed 
before him. Specifically, S was told to label 
each card either A or B, with each A card 
having something in common and each B 
card having something in common. The 
flashing of a white light would indicate a 
correct response and a red light an incorrect 
response. No references were made to the 
presence of MF. Groups serving under the
different dimensions-irrelevant (DI) conditions
were read supplementary instructions
in accordance with Hovland’s (1952) procedure,
in which Ss are informed of the range
of values and dimensions available to them.
For all groups Category A constituted a
vertical figure and B a horizontal figure.
Following these instructions questions were
answered with a paraphrase of the original
instructions.

A schedule of MF/IF was developed for
each condition. The occurrence of MF was 
randomized within each block of 10 trials, 
but with each block receiving approximately 
the same number of MF trials. All Ss within 
a feedback group performed under the same 
schedule. 
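
One way such a schedule could be built is sketched below. This is an illustration under stated assumptions, not the procedure actually used: the text says only that MF was randomized within blocks of 10 trials with approximately equal numbers per block, so the rule for spreading the remainder across blocks and the use of a seeded random generator are hypothetical.

```python
import random


def mf_schedule(n_trials=200, mf_percent=12.5, block_size=10, seed=None):
    """Return a list of booleans, True where misinformative feedback is scheduled.

    Assumes n_trials is a multiple of block_size.
    """
    rng = random.Random(seed)
    n_blocks = n_trials // block_size
    n_mf = round(n_trials * mf_percent / 100)
    # Spread MF trials as evenly as possible: every block gets the base count,
    # and randomly chosen blocks absorb the remainder.
    base, extra = divmod(n_mf, n_blocks)
    counts = [base] * n_blocks
    for block in rng.sample(range(n_blocks), extra):
        counts[block] += 1
    schedule = []
    for count in counts:
        block = [True] * count + [False] * (block_size - count)
        rng.shuffle(block)          # randomize MF positions within the block
        schedule.extend(block)
    return schedule


schedule = mf_schedule(mf_percent=12.5, seed=0)
print(sum(schedule), "MF trials out of", len(schedule))
```

Because all Ss within a feedback group served under the same schedule, a single fixed schedule per MF percentage would be generated once and reused.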

All experimental Ss received 200 trials 
under MF/IF conditions and were then 
shifted to a 100% IF schedule until a criterion 
of 10 successive correct responses was
achieved. Control Ss who usually made long
runs of correct responses in less than 100 trials 
were terminated after 20 successive correct 
responses and, for purposes of analysis, were 
credited with an additional number of correct 
responses equal to the difference between 200 
and the number of the terminal trial. 
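
The crediting rule for control Ss amounts to simple arithmetic, sketched here with hypothetical numbers. The function and argument names are assumptions for illustration.

```python
def credited_correct(correct_before_termination, terminal_trial, total_trials=200):
    """Correct responses credited to a control S who was stopped early.

    The S is credited with one correct response for every trial not run,
    i.e., the difference between total_trials and the terminal trial.
    """
    return correct_before_termination + (total_trials - terminal_trial)


# Hypothetical control S: stopped on trial 60 with 48 correct responses so far.
print(credited_correct(48, 60))
```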

On a given trial E randomly selected a 
stimulus card from the shuffled pack before 
him and placed it in the presentation aperture 
in front of S. After S responded verbally, E 
recorded the classification of the card (A or B) 
and whether S had responded correctly or 
incorrectly. After reference to the MF/IF 
schedule, E determined whether MF or IF 
was to be administered on that trial and 
pressed one of the two keys to signal feedback. 
Average time per trial was approximately 
10 sec. 


RESULTS AND DISCUSSION 


Performance under MF/IF conditions.—Prior
to examination of the
acquisition process, a preliminary
analysis of variance was performed on
the mean number of correct responses
during the MF/IF block, but also
including control conditions. The
results demonstrate significant differences
as a function of DI (F = 14.60,
df = 2/108, P < .01), of MF%
(F = 668.20, df = 3/108, P < .01)
and of the interaction between the two
variables (F = 6.98, df = 6/108,
P < .01). A supplementary Duncan
range analysis, reported in Table 1,
indicates the ordering of means and
the position of significant differences
dividing the cells.

Fig. 1. Mean number correct responses
per 20-trial block under MF/IF and
subsequent IF conditions. (Parameter is MF
percentage. One irrelevant dimension.)

Fig. 2. Mean number correct responses
per 20-trial block under MF/IF and
subsequent IF conditions. (Parameter is MF
percentage. Three irrelevant dimensions.)

As anticipated, the increasing num- 
ber of DI is related to poorer perform- 
ance, in agreement with earlier re- 
search (Archer, Bourne, & Brown, 
1955). The extremely large F ratio 
ascribed to MF% is partially a func- 
tion of bias introduced by inclusion of 
control conditions where optimal per- 
formance was reached early in train- 
ing. Examination of the Duncan 
range results indicates that the num- 
ber of correct responses diminishes 
regularly with increasing stimulus 
difficulty and MF%, although a few 
inversions of order exist. 

In order to analyze the acquisition
process under MF/IF conditions, a
trend test (Grant, 1956) was performed
on the group scores, with the
number of correct responses per
20-trial block providing the raw data.
Performance curves obtained from
each of the 12 conditions appear in
Fig. 1-3. Because the small cell
frequencies and the limitations on the
range of possible cell scores yielded
truncated distributions, an arc-sine
transformation was performed (Snedecor,
1946, p. 445) and analysis
conducted on the transformed data.
Significant differences between group
means occur as a function of MF%
(F = 38.53, df = 2/81, P < .01) and
DI (F = 10.33, df = 2/81, P < .01)
when control data are omitted.
Elimination of the control cells reduces the
MF X DI interaction to the extent
that it is no longer significant. The
overall acquisition curve appears complex,
consisting of significant linear
(F = 73.13, df = 1/81, P < .01),
quadratic (F = 31.56, df = 1/81,
P < .01), and cubic (F = 10.33,
df = 1/81, P < .01) components.
Group differences occur only in the
slope of the different MF curves
(F = 91.25, df = 2/81, P < .01).
These results are consistent with
Morin’s observation of curve
components of a higher order than
quadratic.

Fig. 3. Mean number correct responses
per 20-trial block under MF/IF and
subsequent IF conditions. (Parameter is MF
percentage. Six irrelevant dimensions.)

Performance under subsequent 100%
IF conditions.—Analysis of subsequent
learning under 100% IF conditions
involved several procedural
problems. A few Ss serving in the 
12.5% MF cells achieved levels of 
errorless performance during the last 
block of MF/IF trials. To drop these 
Ss from the succeeding IF series would 
have resulted in a sampling bias, in 
the sense that the “better learners”
would be eliminated from the simpler 
conditions and retained in the more 
difficult conditions. The alternative, 
which was adopted, was to retain 
these Ss and require achievement of 
the same criterion as the other Ss. 
The net effect would be slightly to 
enhance the probability of a spuri- 
ously significant difference between 
MF groups on learning under 100% 
IF conditions. 

Control of differential group attain- 
ment under MF/IF was also needed. 
It was reasoned that performance
differences under 100% IF were a
joint function of DI, prior learning,
and the residual effects of the termi- 
nated MF. Thus, analysis of variance 
dealing with trials to criterion under 
100% IF would yield an apportioning 
of variance attributable to the effect 
of experimental variables combined 
with the effect of previous learning. 
An analysis of covariance, partialing 
out the effect of earlier learning, 
would allow evaluation of the experi- 
mental variables alone. 

Since a Pearson r revealed high
correlation between within-group variances
and means for the 100% IF
cells, a square-root transformation
was performed on all trial-to-criterion
scores in order to reduce the effect.
An analysis of variance was then performed
comparing the transformed
trial-to-criterion scores of the experimental
groups with those obtained by
the control groups in order to determine
whether the 200 MF/IF trials
had acted to increase the number of
trials to achieve a level of 10 successive
correct responses. The results showed
that only DI (F = 17.58, df = 2/108, 
P < .01) and the MF X DI inter- 
action (F = 2.92, df = 6/108, P < .05) 
reached significance. Trials needed to 
achieve criterion were not increased 
as a function of different percentages 
of MF administered during the 
MF/IF block. 
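
The variance-stabilizing step described above can be illustrated with a minimal sketch. It correlates cell means with cell variances on the raw scores and then compares the spread of cell variances before and after a square-root transformation. The scores are invented for illustration (they are not the study's data), and the function names are hypothetical.

```python
import math
from statistics import mean, variance

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def cell_summary(cells):
    """Return (means, variances), one entry per cell."""
    return [mean(c) for c in cells], [variance(c) for c in cells]

# Hypothetical trials-to-criterion scores for four cells; the variance
# grows with the mean, as reported for the 100% IF cells.
raw_cells = [
    [4, 6, 9, 12, 16],
    [16, 20, 25, 30, 36],
    [36, 42, 49, 56, 64],
    [64, 72, 81, 90, 100],
]
sqrt_cells = [[math.sqrt(x) for x in cell] for cell in raw_cells]

raw_means, raw_vars = cell_summary(raw_cells)
_, sqrt_vars = cell_summary(sqrt_cells)

r_raw = pearson_r(raw_means, raw_vars)           # mean-variance correlation
raw_spread = max(raw_vars) / min(raw_vars)       # largest / smallest variance
sqrt_spread = max(sqrt_vars) / min(sqrt_vars)    # same ratio after transform

print(f"r(mean, variance), raw scores: {r_raw:.3f}")
print(f"variance ratio raw: {raw_spread:.2f}, after sqrt: {sqrt_spread:.2f}")
```

With data of this kind the cell variances become nearly equal after the transformation, which is the condition the subsequent analysis of variance assumes.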

Disregarding the control data, anal- 
ysis of variance performed on the 
transformed trials-to-criterion scores 
of the experimental cells provides a 
similar picture. Here only the differences
between DI groups achieve
significance (F = 10.31, df = 2/80,
P < .01). However, if the effect of
prior learning is partialed out (in
terms of terminal level of performance
during the last two blocks of MF/IF),
the MF variable becomes significant
(F = 6.78, df = 2/80, P < .01) and
the effect of DI is sharply diminished.
Thus performance under subsequent
IF conditions is affected by the
terminated MF trials, but the effect is
complex and requires further explication.
Probability matching.—Probability
matching behavior was evaluated by
subtracting each S’s attained number
of correct responses on the terminal
40 MF/IF scores from the theoretically
expected scores. These difference
scores were evaluated by
separate t tests for each cell. The
results present a complex picture.
Adequate matching was noted on
five of the nine experimental cells:
the three 12.5% MF groups, the 25%
MF-1 DI, and the 25% MF-3 DI
conditions. Significant negative deviations
from probability matching
were noted on all three 37.5% MF
cells and a significant positive deviation
on the 25% MF-1 DI cell. The
breakdown of probability matching in
the 37.5% MF conditions is noteworthy
but not unique, since a
similar phenomenon occurs in Pishkin’s
most difficult MF conditions.
One possible explanation resides in
the fact that Ss’ use of response
patterns which yield positive reinforcement
on 67.5% of the trials is
not distinguishably more effective
than use of patterns which are
successful on 50%, a level which could 
be reached by randomly responding. 
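
The difference-score test described above can be sketched as follows. Each S's attained score is subtracted from the expected score and the mean difference is tested against zero with a one-sample t test. The scores and the expected value of 25 are invented for illustration; how the theoretically expected score was derived is not stated in the text, so it is simply passed in as a parameter. The function name is hypothetical.

```python
import math
from statistics import mean, stdev

def matching_t(attained, expected):
    """One-sample t test of (expected - attained) difference scores against 0.

    Returns (t, degrees of freedom).
    """
    diffs = [expected - a for a in attained]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical cell of 10 Ss whose scores scatter around the expected value.
matched = [24, 26, 25, 23, 27, 25, 24, 26, 25, 25]
t_matched, df = matching_t(matched, expected=25.0)

# Hypothetical cell of 10 Ss who fall consistently short of the expected value.
short = [20, 21, 19, 22, 20, 21, 19, 20, 22, 20]
t_short, _ = matching_t(short, expected=25.0)

# For df = 9 the two-tailed .05 critical value of t is 2.262.
CRITICAL_T = 2.262
print(f"matched cell: t({df}) = {t_matched:.2f}, deviates: {abs(t_matched) > CRITICAL_T}")
print(f"short cell:   t({df}) = {t_short:.2f}, deviates: {abs(t_short) > CRITICAL_T}")
```

A cell whose |t| stays below the critical value is one in which matching would be called adequate in the sense used here.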


SUMMARY 


The effect of percentage of misinformative
feedback (MF: 0, 12.5, 25, 37.5%) and the
number of dimensions irrelevant to solution
(DI: 1, 3, 6) on acquisition in concept
identification and on subsequent performance
under 100% informative feedback (IF) was
investigated. A total of 120 Ss served, with
90 experimental Ss being administered 200
MF/IF trials, then shifted to 100% IF until
criterion was reached.
The results were: (a) Under MF/IF conditions
significant differences occurred as a
function of MF, DI, and MF X DI, with
increasing MF and DI leading to poorer
performance. (b) Trend analysis on blocks of
trials under MF/IF revealed a curve composed
of significant linear, quadratic, and
cubic components; the linear component was
significantly affected by MF%. (c) Subsequent
100% IF learning was significantly
affected by DI; inclusion of control Ss in the
analysis led to MF X DI achieving significance.
(d) Analysis of covariance on the
100% IF data, partialing out the effect of
prior learning, revealed only a significant MF
effect. (e) Probability matching appeared in
five of nine MF/IF cells.


RESISTANCE TO EXTINCTION AS A JOINT FUNCTION
OF REWARD MAGNITUDE AND THE SPACING
OF EXTINCTION TRIALS¹


WINFRED F. HILL AND NORMAN E. SPEAR


Northwestern University 


The effect of reward magnitude on 
resistance to extinction is an unsettled 
question, even for that subset of 
studies in which the independent 
variable is the weight of food on 
a continuous reinforcement schedule 
and the dependent variable is the 
running speed of rats. Metzger, 
Cotton, and Lewis (1957) and Zeaman 
(1949) found that a larger reward gave 
faster running early in extinction, 
with the group curves tending to con- 
verge as extinction proceeded. This is 
what would be expected if in extinc- 
tion K (Hull, 1951; Spence, 1956) 
adjusts to the absence of reward from 
different levels. On the other hand, 
Armus (1959) and Hulse (1958) 
found faster running throughout ex- 
tinction after a smaller reward. This
might reflect a contrast or depression
effect for the large reward group.

The most prominent difference in
procedure between these two sets of
studies was in the distribution of the
extinction trials. Metzger, Cotton,
and Lewis and Zeaman gave massed 
extinction, whereas Armus and Hulse 
gave spaced extinction. The present 
experiment is a test of the hypothesis 
that reward magnitude and spacing of 
extinction trials will interact within a 
single experiment. If confirmed, this 
relationship would be of considerable 
significance for the interpretation of 
extinction. 


1 This research was supported by Grant
G-8706 from the National Science Foundation.


METHOD


The Ss were 64 experimentally naive 
female albino rats of the Sprague-Dawley 
strain, 74 days old at the beginning of train- 
ing. Training and extinction took place in an 
enclosed runway previously described by 
Lewis (1956). 

Each S received six daily 3-min. sessions 
of prehandling, the last session 48 hr. prior
to the beginning of experimental training. 
During each session S was allowed to explore 
a large unpainted wooden box, presented with 
four of the pellets later to serve as reward, 
and picked up and replaced at least five times 
by E. A once-daily feeding schedule began 
on the first day of prehandling and was 
maintained throughout experimental training. 
The ration was 10 gm. of finely ground Purina 
lab chow and was presented 50 to 60 min. 
after the start of prehandling or experimental 
training. 

All Ss received 25 trials of acquisition, 
5 per day, and 20 trials of extinction be- 
ginning on the sixth day. During both 
acquisition and extinction, S was confined in 
the goal box for a minimum of 15 sec. or 
until all pellets were consumed (maximum of 
4 min.). Between trials on the same day, S 
was confined in its home cage, with water 
available, for 20 sec. During extinction,
the food cup was removed from the goal box.

Differential training was introduced by 
way of a 2 X 2 factorial design, varying the 
number of .045-gm. Noyes pellets given as 
reward during acquisition (four pellets or one) 
and the intertrial interval during extinction 
(20 sec. or 24 hr.). Thus the four experi- 
mental groups of 16 Ss each may be desig- 
nated according to extinction spacing or 
massing and according to acquisition mag- 
nitude as Sp-4, Sp-1, M-4, and M-1. 


RESULTS AND DISCUSSION 


Acquisition.—Curves of acquisition
speed are shown in Fig. 1. Double
classification analysis of variance on
the mean speeds for the last five trials
confirms the superiority of the four-pellet
condition (F = 18.90, df = 1/60,
P < .001). The dummy distribution
variable and the interaction are both
nonsignificant (Fs = 2.91 and 3.44,
respectively), indicating that the two
spacing groups were roughly equivalent
before the spacing variable was
introduced.

Fig. 1. Mean speeds in acquisition (five trials a day) for two magnitudes
of reward and for Ss subsequently receiving massed or spaced extinction.

The curves show a tendency for the 
greatest increases in speed to come 
between the end of 1 day and the 
beginning of the next. This reminis- 
cence was more marked in the later 
stages of learning and in the one- 
pellet groups, combined under these 
conditions with a marked within-days 
decrement in speed. To quantify this
reminiscence effect, a score was computed
for each S on each day by subtracting
the mean gain in speed between
each trial and the next from the
gain in speed between the last trial of
the previous day and the first trial
of the day in question. When this
score is averaged over the 4 days of 
acquisition (excluding the first day, 
for which it cannot be computed), the 
mean is significantly positive at the 
.001 level for both the four-pellet 
(Sp-4 plus M-4) and the one-pellet 
(Sp-1 plus M-1) groups (t's = 4.85 
and 6.75, respectively, for the differ- 
ence from zero). This indicates that 
the trial-to-trial gain was greater over 
the 1-day interval than over the 20- 
sec. interval. A trend analysis showed 
the overall mean to be significantly 
higher at the .05 level for the one- 
pellet than for the four-pellet group 
(F = 3.98, df = 1/62). The increase 
over trials yielded a significant F of 
3.98 (df = 3/186, P = .01) but one 
which is not quite significant for the 
1 and 62 df recommended as con- 
servative by Geisser and Greenhouse 
(1958). The F for Group X Trend 
interaction was less than 1.

Extinction.—The course of extinction
is shown in Fig. 2. It is evident
that larger reward and spaced practice
resulted in greater resistance to extinction,
the latter in spite of the
(nonsignificant) superiority of the
to-be-massed group in acquisition.
The statistical reliability of these
findings is confirmed by analysis of
variance of mean speeds on Trials 2-6
and Trials 16-20, with 1 and 60 df
for all F ratios. In the analysis of
early extinction, magnitude and spacing
were both significant at the .05
level (Fs = 4.71 and 5.18, respectively),
with an F for interaction less
than 1. In the analysis of late extinction,
magnitude was significant at
the .05 level (F = 6.68), distribution
at the .001 level (F = 26.28), and
interaction at the .01 level (F = 7.45).
The interaction reflects the conver- 
gence of the two magnitude curves in 
the massed but not in the spaced 
condition. 
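
The reminiscence score defined in the acquisition analysis can be written out as a short sketch. For each day after the first, the mean trial-to-trial gain is subtracted from the overnight gain (first trial of the day minus last trial of the previous day). The text does not say which trials enter the mean gain; the sketch assumes the within-day gains of the day in question. The speeds are invented for illustration and the function name is hypothetical.

```python
def reminiscence_scores(speeds_by_day):
    """Return one reminiscence score per day, starting with the second day.

    speeds_by_day: list of days, each a list of running speeds in trial order.
    Score = overnight gain - mean within-day gain (assumed: same day's trials).
    """
    scores = []
    for previous_day, day in zip(speeds_by_day, speeds_by_day[1:]):
        overnight_gain = day[0] - previous_day[-1]
        within_day_gains = [later - earlier for earlier, later in zip(day, day[1:])]
        mean_within_gain = sum(within_day_gains) / len(within_day_gains)
        scores.append(overnight_gain - mean_within_gain)
    return scores

# Hypothetical speeds for one S: 5 days of acquisition, 5 trials per day.
one_subject = [
    [0.5, 0.6, 0.7, 0.8, 0.8],
    [1.0, 1.1, 1.1, 1.0, 1.0],
    [1.3, 1.3, 1.2, 1.2, 1.1],
    [1.5, 1.4, 1.4, 1.3, 1.3],
    [1.6, 1.6, 1.5, 1.4, 1.4],
]
scores = reminiscence_scores(one_subject)
mean_score = sum(scores) / len(scores)
print(f"daily scores: {[round(s, 3) for s in scores]}, mean: {mean_score:.3f}")
```

Five days of acquisition yield four scores per S, matching the 4 days over which the score was averaged; a positive mean indicates that more was gained overnight than between trials within a day.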


Discussion.—The main hypothesis of 
the experiment was that reward magni- 
tude has opposite effects on extinction 
depending on the spacing of trials during 
extinction. This prediction was clearly 
not confirmed. Larger reward gave 
greater resistance to extinction with both 
massed and spaced extinction, and the 
interaction of the two variables late in 
extinction was in the opposite direction 
from what was predicted. The present 
results thus confirm Metzger, Cotton, 
and Lewis (1957) and Zeaman (1949), 
as well as several studies of reward
magnitude using concentration of sucrose
in the Skinner box (e.g., Collier & Willis,
1961; Guttman, 1953). They do not,
however, explain the contradictory results
of Armus (1959) and Hulse (1958).

The reminiscence effect in acquisition
was unexpected. It is possible that this
effect and the greater resistance to extinction
in the distributed group may
both be due to the same mechanism.
This mechanism might be either reactive
inhibition (Hull, 1951) built up during
massed practice or, alternatively, activity
drive (Hill, 1956) built up during rest in
small cages and satiated by massed
practice.


SUMMARY


Rats received 25 trials of acquisition and
20 trials of extinction in a straight alley, with
reward magnitude (four pellets or one) and
intertrial interval in extinction (20 sec. or
24 hr.) varied factorially. Resistance to
extinction was greater for large reward and
for spaced extinction, without the interaction
predicted from a comparison of earlier
studies. Marked reminiscence was observed
from day to day in acquisition.


HIERARCHIES IN CONCEPT ATTAINMENT


ULRIC NEISSER AND PAUL WEENE¹


Brandeis University 


In the laboratory or out of it, new 
ideas are always built on old ones. 
To attain a typical experimental concept,
say “three borders,” S must
already be able to identify borders, 
to count, to distinguish between E’s 
positive and negative statements, and 
so on. Much cognitive activity is 
hierarchically organized, in that the 
abstractions at one level form the 
basis of new abstractions at the next. 
The present experiment is an attempt 
to study hierarchical concepts ex- 
plicitly. Although only binary con- 
cepts were used (more than two 
features were never relevant), the 
range of possibilities included three 
degrees of hierarchical depth. 


There are many ways in which two
or more features of a stimulus pattern
may be combined into an attribute of
higher order. For example, they may
be conjoined: the attribute is defined 
by the joint presence of several 
features. A certain object is ‘‘of good 
quality” if it has been made skillfully 
(A), and of first-class materials (B). 
Neither feature alone is sufficient; 
both together are decisive. Bruner, 
Goodnow, and Austin (1956) worked 
extensively with conjunctive prop- 
erties, but studied disjunctive attri- 
butes as well. In a disjunction, the 
presence of either property (or of 
both) is sufficient to define the 
concept. A patient may have an 
allergic reaction to either strawberries 
(A) or tomatoes (B). In conjunctive
concepts, the criterial attribute may
be written symbolically as “A-B,”
while the corresponding notation for
disjunctive attributes is “AvB.”
These two cases do not exhaust the
possibilities, even if nothing matters
but the presence or absence of two
distinguishing features. There are 10
types of criterial attributes which can
be based on one or on two features.
They are listed in Table 1. Attributes
based on more complex relations (“A
followed by B,” “A within B,” and so
on) will not be considered here.

1 The experiment was performed while
both authors were staff members of Lincoln
Laboratory, Massachusetts Institute of Technology,
operated with support from the
United States Army, Navy, and Air Force.

The 10 types of bivariate attributes
fall naturally into three levels, as
indicated in Table 1. The univariate
attributes are evidently the simplest.
Next are a group of six bivariate
attributes, made up directly from the
univariate ones by negating, conjoining,
or disjoining them. Finally,
the two most complex attributes are
formed by disjoining certain conjunctive
pairs. Successive levels represent
increasing complexity, not only
in terms of the number of symbols
needed to define the attributes, but in
terms of a hierarchical structure.
That is, attributes of Level II are
combinations of those at Level I, and
are components of those at Level III.
It must be understood that this
ordering arises only because we have
taken negation, conjunction, and disjunction
(rather than, say, double
implication) as the basic operations,
to be represented by elementary
symbols. The hierarchy is merely a
tautology until it is related to empirical
findings like those presented
here. In a sense, the findings of the
present experiment support the selection
of these three operations as
primitive.


TABLE 1

TYPES OF ATTRIBUTES WHICH CAN BE DEFINED BY PRESENCE
OR ABSENCE OF TWO FEATURES

Name and Symbolic Designation | Description of Positive Instance | Example

Level I
Presence (A) | A must be present | Vertebrate: must have a backbone
Absence (—A) | A must not be present (complement of presence) | Invertebrate: must not have a backbone

Level II
Conjunction (A-B) | Both A and B must be present | Good quality: both material and workmanship must be first class
Disjunction (AvB) | Either A or B or both must be present | Allergenic: a food which contains either tomatoes or strawberries (for example)
Exclusion (A-—B) | A must be present and B not present | Eligible for driver's license: must have passed test and not have committed felony
Disjunctive absence (—Av—B) | Either A or B, or both, must be absent (complement of conjunction) | Poor quality: either material or workmanship is not first class
Conjunctive absence (—A-—B) | A and B must both be absent (complement of disjunction) | Nonallergenic: a food which contains neither tomatoes nor strawberries (for example)
Implication (—AvB) | A may be absent, but if A is present then B must be also; thus A implies B (complement of exclusion) | Ineligible for driver's license: must either have not passed test or have committed felony

Level III
Either/or (A-—B)v(—A-B) | Either A or B must be present, but not both together | Negative product: either factor negative, but not both
Both/neither (A-B)v(—A-—B) | Both A and B must be present, unless neither is (complement of either/or) | Positive product: both factors may be negative, or neither, but not just one


The 10 types of attributes fall into
five complementary pairs. Anything
that is a positive instance of one
member of such a pair (i.e., which has
its attribute) is a negative instance of
the other. For example, A-B is the
complement of —Av—B because all
and only those objects which are
described by the former expression
are not covered by the latter. Symbolically,
one may find the complement
of an expression by changing
every “-” to a “v” (and vice versa)
and also every plus to a minus (and
vice versa).

The experiment reported here is a
study of the relative difficulty of 
attaining concepts at these several 
levels. The underlying hypothesis 
was that concepts at hierarchically 
higher levels would be more difficult 
to attain than those of lower levels. 


METHOD 


Experimental materials.—The stimulus objects
were strings of four consonants, each
string printed on a 4 X 6 in. filing card. Only
J, Q, V, X, and Z were used. Thus there were
625 distinguishable stimuli altogether (JJJJ,
JJJQ, JJJV, ..., QJQZ, VQZX, ...,
ZZZZ). The concepts were defined in terms
of the presence or absence of one or of two
of these letters. The order and frequency of
the letters in the string were never relevant,
so that (for example) QQVZ was always
equivalent to VZZQ, ZVQV, and to any other
string which contained Q, Z, and V but did 
not contain either J or X. Each of the types 
of criterial attribute represented in Table 1 
could be realized in a number of specific ways. 
For example, JvX, QvZ, etc., are all dis- 
junctions. It can easily be verified that 
altogether 110 different univariate and bi- 
variate attributes can be defined on these 
stimuli. Any given stimulus is a positive 
instance of 55 of these and negative instance 
of the other 55. For example, QQVZ is a
positive instance of Q, of —X, of VvJ, of
(Q-V)v(—Q-—V), etc., and a negative
instance of —Q, of X, of —V-—J, of
(Q-—V)v(—Q-V), etc.
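
The counts stated above (625 stimuli, 110 attributes, 55 positive for any stimulus) can be verified with a short enumeration. The sketch treats the bivariate types as the two-feature truth tables that depend on both features, which is one way of arriving at the 10 types per letter pair minus the univariate ones; this framing and the function names are assumptions made for illustration.

```python
from itertools import combinations, product

LETTERS = "JQVXZ"
ROWS = list(product([False, True], repeat=2))  # (first present, second present)

def depends_on_both(table):
    """True if the truth table changes with each of its two features."""
    f = dict(zip(ROWS, table))
    on_first = any(f[(False, b)] != f[(True, b)] for b in (False, True))
    on_second = any(f[(a, False)] != f[(a, True)] for a in (False, True))
    return on_first and on_second

# Of the 16 truth tables on two features, 2 are constant and 4 depend on
# only one feature; the rest are the bivariate types.
BIVARIATE = [t for t in product([False, True], repeat=4) if depends_on_both(t)]
PAIRS = list(combinations(LETTERS, 2))

def count_attributes():
    """Presence and absence of each letter, plus every bivariate type per pair."""
    return 2 * len(LETTERS) + len(BIVARIATE) * len(PAIRS)

def positives(stimulus):
    """Number of attributes of which the stimulus is a positive instance."""
    present = {x: x in stimulus for x in LETTERS}
    # Each letter contributes exactly one match: presence or absence.
    n = sum(1 for x in LETTERS for sign in (True, False) if present[x] == sign)
    for a, b in PAIRS:
        for table in BIVARIATE:
            n += dict(zip(ROWS, table))[(present[a], present[b])]
    return n

STIMULI = ["".join(s) for s in product(LETTERS, repeat=4)]
print(len(STIMULI), len(BIVARIATE), count_attributes(), positives("QQVZ"))
```

Because the attributes come in complementary pairs, every stimulus is positive for exactly half of them, which is why the count of 55 holds for all 625 strings and not only for QQVZ.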

Subjects.—The Ss were 20 students of
college age. They worked for about 3 hr.
every morning, in groups of 5. A group of
practiced Ss could complete about four
problems in such a session.

Apparatus.—The sequence of stimuli for a
given concept was arranged as a deck of cards
and set in a wooden frame, with the front card
concealed by a spring-loaded shutter. To
present a stimulus, E released the shutter.
Between trials, he closed the shutter and
removed the front card.

Procedure.—When the stimulus appeared,
each S, working independently, responded
“plus” if he thought it was a positive instance
of the attribute he was to discover, and
“minus” if he thought not. Responses were
made by means of toggle switches which controlled
appropriate indicators on a panel
visible only to E. When all Ss had responded,
E noted the response, informed the Ss of
the correct answer, and then presented the
next stimulus. No attempt to time the presentations
was made, but an S who hesitated
more than about 15 sec. was asked to guess
rather than delay further.

All sequences of stimuli were arranged to 
make positive and negative instances equally 
probable, and successive stimuli independent. 
(Appropriate sequences were prepared with an 
IBM 709 computer.) Since pure guessing 
would yield 50% correct responses, S was 
judged to have attained a concept when he 
had made 25 consecutive responses with only 
a single error. (The possibility of carelessness 
made a 100% criterion inadvisable.) Ordi- 
narily, a single problem was continued for 100 
stimuli or until all Ss had reached criterion. 
The situation was kept as noncompetitive as 
possible. The group was not informed about 
the performance of any individual, and each S 
responded on every trial whether or not he 
had reached criterion.

Before the experiment, Ss were told about
the kinds of attributes that would be criterial.
They were instructed that only the presence
or absence of particular letters mattered, and
that not more than two letters would be
relevant. It was stressed that sequence and
possible reduplication of letters on the cards
were irrelevant. It was made clear that the
absence of a letter, or of two letters, could be
as important as its presence, and that the
absence of one could be systematically connected
with the presence of another.
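
The attainment criterion described under Procedure (25 consecutive responses with only a single error) can be scored with a sliding window. The sketch below reads the criterion as "at most one error in a run of 25", which is an interpretation, and reports the number of trials that precede the criterion run. The function name and the response sequences are hypothetical.

```python
def trials_to_criterion(correct, window=25, max_errors=1):
    """Number of trials before the first run of `window` consecutive responses
    containing at most `max_errors` errors, or None if no such run occurs.

    correct: sequence of booleans, one per trial, in order of presentation.
    """
    errors = 0
    for i, ok in enumerate(correct):
        if not ok:
            errors += 1
        # Drop the response that has just left the window.
        if i >= window and not correct[i - window]:
            errors -= 1
        if i >= window - 1 and errors <= max_errors:
            return i - window + 1
    return None

# Hypothetical S: 10 wrong responses, then consistently correct.
learner = [False] * 10 + [True] * 40
# Hypothetical S who alternates and never reaches criterion in 100 stimuli.
guesser = [True, False] * 50

print(trials_to_criterion(learner))   # run may begin with the one permitted error
print(trials_to_criterion(guesser))
```

Under this reading a run may begin on an error, so the first S is credited with 9 trials before criterion instead of 10; a stricter reading would change only the `max_errors` argument.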

Experimental design.—The first three problems
(V, XvJ, Q-—Z) were the same for each
group, and were considered practice. Ex- 
planation by E of potentially relevant and 
irrelevant attributes continued during these 
problems. Thereafter, each group of 5 Ss 
was given two consecutive cycles through 
the 10 types of problems described in Table 1. 
The order of problems within each cycle was 
varied from group to group, as were the 
letters which exemplified each type of concept ; 
conjunction, for example, might be repre- 
sented by J-X, Q-V, Z-J, etc. Thus each S 
was presented with 23 concept attainment 
problems. 

For two of the groups, a “nonresponding” 
cycle through the 10 types was interpolated 
between the three practice trials and the first 
of the cycles mentioned above. The Ss were 
shown 100 positive instances of each concept, 
and then asked to write a description of it. 
Since these groups did not differ appreciably 
from the others in their performance on the 
concept-formation cycles, the data have been 
combined for this paper. The results of the 
nonresponding cycle (and of other such cycles 
carried out at the conclusion of the main 
experiment) were too ambiguous to merit 
description here. 


RESULTS 


Table 2 exhibits the median trials 
needed to reach criterion on each type 
of problem, considering the two cycles 


separately. (Means cannot be given, 
because some Ss failed to attain the 
criterion on some problems.) The 
results support the hypothesis that 
three distinct levels of difficulty are 
represented. Problems of Level II 
are systematically harder than those 
of Level I and easier than those of 
Level III. There is also a substantial
practice effect: in 8 of 10 cases the
median for the second cycle is below
that for the first.


TABLE 2

TRIALS TO CRITERION FOR DIFFERENT TYPES OF PROBLEMS

[Table entries are not recoverable from the source. Surviving headings are listed.]

Column heading: Cycle 1
Row heading: Type of Concept
Level I: Presence (A); Absence (—A)
Level II: Conjunction (A-B); Disjunction (AvB); Exclusion (A-—B);
Disjunctive absence (—Av—B); Conjunctive absence (—A-—B);
Implication (—AvB)
Level III: Either/or (A-—B)v(—A-B); Both/neither (A-B)v(—A-—B)

Note.—Q1/Q3 indicates the first and third quartiles; N = 20 throughout.
A dash indicates that the corresponding quartile S did not attain
criterion. The 25 criterion trials are not included.


TABLE 3

PROPORTIONS OF SS FOR WHOM ONE CONCEPT WAS EASIER
THAN ANOTHER: ALL CONCEPT PAIRS

[Table entries are not recoverable from the source. Complete fractions that
survive, without their row and column positions: 14/19, 15/19*, 14/17*,
16/19*, 11/20, 10/20.]

Note.—Each numerator is the number of Ss who attained the concept of that
row more quickly than the concept of that column; the denominator is the
number available for the comparison. Upper fractions for Cycle 1, lower
fractions for Cycle 2. Comparisons involving two different levels are above
and to the left of the heavy line.
* P < .05; two-tailed binomial test.

In Table 3, every type of concept 
is explicitly compared with every 
other type. The comparisons, made 
separately for the two cycles, are in 
terms of the proportion of Ss who 
found one type easier than the other. 
Most proportions are based on slightly 
fewer than 20 Ss, since those who 
found the two problems equally 
difficult, or solved neither, are not 
counted. For each comparison, the 
null hypothesis is that the two concepts
are equally difficult, and that
the tabulated proportion differs from
1/2 only by chance. In all those cases
where the comparison is between
concepts of different levels, we have
the counterhypothesis that an S is
more likely to find the lower-level
hypothesis easier. Table 3 is so
arranged that the counterhypothesis
is supported by proportions above 1/2,
and not by those below 1/2. It is also
arranged so that all cases to which the
counterhypothesis applies (i.e., comparisons
between concepts at different
levels) fall above and to the left of the
heavy line. It appears that all but
1 of the 56 interlevel comparisons
are in the predicted direction. Moreover,
39 of these proportions are
significantly different from 1/2 when
considered individually. It is evident
that levels of complexity play an 
important role in determining the 
difficulty of concept attainment. 
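
The individual comparisons rest on a two-tailed binomial test against a proportion of 1/2, which can be computed exactly. The sketch below applies it to two fractions that appear in Table 3, 15/19 (starred) and 14/19 (unstarred). The function name is hypothetical, and doubling the tail probability is the usual construction for a symmetric null, assumed here to be the one used.

```python
from math import comb

def binomial_two_tailed(k, n):
    """Exact two-tailed probability of a split at least as extreme as k of n
    when each S is equally likely to find either concept easier."""
    tail = max(k, n - k)
    upper = sum(comb(n, i) for i in range(tail, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)

for k, n in [(15, 19), (14, 19), (14, 17), (10, 20)]:
    p = binomial_two_tailed(k, n)
    flag = "*" if p < .05 else ""
    print(f"{k}/{n}{flag}: P = {p:.4f}")
```

With about 19 Ss available per comparison, a split of 15 to 4 is the smallest that reaches the .05 level, which agrees with the placement of the asterisks on the surviving entries.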

No prediction was made about the 
relative difficulty of concepts within 
a single level. Indeed, Table 3 shows
proportions near 1/2 for most such comparisons.
But implication (—AvB)
and disjunctive absence (—Av—B)
are significantly more difficult than
the other second-level concepts on the
first cycle of problems. The probable
explanation is that Ss did not fully
understand the definition of these
concepts at first. On the second cycle
this obstacle had been overcome by
familiarity, and these concepts lost
their special status. There is one
other anomalous finding: —A was
easier than A. This result is difficult
to understand, since these types differ
only in which half of the universe of
stimuli is called “plus.”


DISCUSSION 


Why are higher-level concepts more 
difficult to attain? It might be supposed 
that, for complex combinatorial reasons, 
an unusually large number of stimuli is 
needed for logical elimination of com- 
peting hypotheses when a _ high-level 
attribute is the criterial one. We ex- 
plored this possibility by writing a com- 
puter program (for the IBM 709) which 
solves our problems by rote. It has a 
list of the 110 possible concepts, and 
checks off those which are eliminated by 
each stimulus as it appears until only one 
concept remains. On the average, this 
program needs from 8 to 12 instances to 
pinpoint the defining attribute, although 
it may occasionally take much longer 
(if the string of stimuli happens to be 
unusually redundant). Paradoxically, 
the program takes slightly longer to 
identify the simple attributes (A and 
—A) than those of Level II, while the 
concepts of Level III take the fewest 
trials of all! The reason seems to be that
when a series of stimuli are all compatible
with a simple attribute such as
“Z,” there is a relatively high probability
that they will all be compatible with 
certain high-level disjunctions, such as 
ZvQ, as well. 

We wish to emphasize that the com- 
puter program was not written to 
simulate the behavior of human Ss, but 
simply to establish the rates at which the 
different concepts could be attained by 
logical elimination. The discovery that 
human Ss do not attain concepts in this 
way is hardly surprising. 

A second explanation of the difficulty
of attaining high-level concepts might
appeal to the difficulty of formulating
them verbally. Perhaps Ss find them
unfamiliar, or cannot easily keep them in
mind. The unexpected results with implication
and disjunctive absence suggest
that there is some validity to this
interpretation. It is not fully adequate,
however. The Ss seemed to have a
better verbal understanding of either/or
than of most of the concepts at Level II
which were more quickly attained.

In our opinion, higher-level concepts
are more difficult because of their hierarchical
organization. To identify an
instance of (Z-Q)v(—Z-—Q) one must
have Z-Q and —Z-—Q available as
components. After all, any individual
instance of the first concept is also an
instance of one of the latter two. Moreover,
to work with Z-Q, S must know a
Z and a Q when he sees one. Thus the
levels into which we have divided the
possible binary concepts may correspond
to actual levels of input analysis by Ss.
To attain a complex concept, they must
use, and therefore must have attained,
preliminary concepts at lower levels.


SUMMARY 


Twenty Ss were employed in a study of the 
relative difficulty of attaining 10 different 
types of concepts. All types involved only 
the presence or absence of two properties, but 
some were hierarchically more complex than 
others. For example, “Both A and B” is 
more complex than “A” but less complex than 
“Both A and B or neither.” The results 
indicate that the difficulty of a concept varies 
directly with its complexity. This order of 
difficulty does not appear when a computer 
program is used to attain the concepts by 
simple elimination. It seems to reflect a 
hierarchical organization of conceptual processes
in the Ss themselves.


REPLICATION REPORT: LATENT LEARNING IN A T MAZE
AFTER SHOCK IN ONE END BOX


HENRY GLEITMAN AND MAGDALENA M. HERMAN


Swarthmore College 


Tolman and Gleitman (1949) have re- 
ported latent learning in a T maze with highly 
differentiated end boxes. They found ap- 
propriate choice behavior after rats were 
shocked in one or the other of the two end 
boxes, following an equal number of reinforce- 
ments on both sides. 

Method.—The original experiment was 
replicated in all respects but the following: 
(a) Guillotine doors were used instead of the 
one-way doors used in the original experiment,
(b) all Ss were run under 24 hr. of
food deprivation, and (c) the location of the 
two differentiated end boxes was systemat- 
ically varied, the dark one being on the right 
side for half the Ss, and on the left for the 
other half. Finally, since there was some 
possibility that the positive results of the 
first experiment might be due to the distribu- 
tion of trials (only two trials per day, one free 
and the other forced), two conditions of dis- 
tribution were employed. 

The Ss were 38 experimentally naive
female rats of white Angora strain, approximately
100 days old at the beginning of the
experiment. All Ss were reduced to 90% of
their original body weight and kept at that
level throughout the experiment. After 16
trials of pretraining on a straight runway,
they were divided into two equal groups
roughly matched on running times during
pretraining. Group I received two trials per day
on the apparatus over 10 days, the first trial 
free and the second forced. Group II received 
four trials per day over 5 days, the first and 
third being free and the others forced. One 
day following their last training trial, Ss were 
placed in one of the two end boxes to find
food, then into the other to receive two
periods of intermittent shock. As in the
original experiment, the spatial location of the
end boxes was markedly different during this 
phase of the experiment as compared to 
training. Again as in the original experiment, 
half of the Ss were shocked in the preferred, 
the other half in the nonpreferred end box. 
The Ss were tested in the original apparatus, 
about 1 hr. after they had been shocked. 

Results.—Fourteen out of 19 Ss in Group I,
and 13 out of 19 Ss in Group II, chose the 
side away from that on which they had been 
shocked. It is thus apparent that at least 
for this limited range of values there was no 
effect of distribution of practice on the final 
choice. Since the two groups were virtually 
identical in their choice behavior, their 
results were combined and tested for sta- 
tistical significance. Chance selection of the 
harmless side could be ruled out at the 1% 
level of significance (CR = 2.60, P < .01). 
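
The critical ratio reported above can be checked with a normal approximation to the binomial. The sketch below assumes that the combined count (14 + 13 = 27 of 38 Ss) was tested against chance (p = .5), two-tailed and without a continuity correction; the report does not state the exact formula, and the function name `critical_ratio` is illustrative only.

```python
import math

def critical_ratio(successes, n, p_chance=0.5):
    """Normal-approximation z score for a binomial count against chance."""
    expected = n * p_chance
    sd = math.sqrt(n * p_chance * (1 - p_chance))
    return (successes - expected) / sd

# Counts taken from the Results paragraph: 14 of 19 and 13 of 19 Ss.
chosen = 14 + 13
n_subjects = 19 + 19

cr = critical_ratio(chosen, n_subjects)
# Two-tailed probability under the normal curve (assumed test direction).
p_two_tailed = math.erfc(cr / math.sqrt(2))
proportion = chosen / n_subjects

print(round(cr, 2), round(proportion * 100), p_two_tailed < 0.01)
```

Under these assumptions the ratio comes to about 2.60 and the proportion choosing the harmless side to 71%, in line with the values given in the report.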
While the major finding of the original 
study was substantiated in the present experi- 
ment, there is some difference in the magni- 
tude of the effects. In the present study, the 
harmless side was chosen by 71% of the Ss,
in the original experiment by 88%. This 
difference may be due to rather strong 
turning or place preferences developed in the 
course of the present experiment, which some- 
times were strong enough to override other 
factors. Of the 11 Ss who did not choose the 
harmless side on the final test, 9 were Ss
who had been shocked in the preferred end box.


SUPPLEMENTARY REPORT: THE WEINSTOCK PARTIAL
REINFORCEMENT EFFECT AND HABIT REVERSAL


Sheffield (1949) found the partial reinforcement
effect (PRE) for massed acquisition
(15-sec. intertrial interval) but not for
distributed acquisition (15-min. intertrial
interval). Weinstock (1954, 1958) found the
PRE under widely spaced trials (24-hr.
intertrial interval). Both of the investigators
used a simple running response. Wike (1953)
and Grosslight and Radlow (1954) found the
PRE for massed acquisition in a habit
reversal discrimination problem. The purpose
of the present experiment was to determine
whether or not the PRE would be
present in a habit reversal discrimination
problem with a 24-hr. intertrial interval.

Method.—A 2 × 3 factorial design was
used incorporating 100%, 70%, and 40%
reinforcement, and 20-sec. and 24-hr. intertrial
intervals. The Ss were 60 experimentally
naive male albino rats. A Y alley discrimination
apparatus was employed. Stimuli and
trial procedures were essentially the same as
in Grosslight and Radlow’s experiment. All
Ss were given 40 acquisition trials and 40
habit reversal trials.

Results and discussion.—Figure 1 shows
the mean number of correct responses for all
groups for both acquisition and reversal. An
analysis of covariance for the first five trials
in massed habit reversal shows statistically
significant differences among the 100%, 70%,
and 40% groups (F = 19.08; P < .01) with
the 100% group showing the least resistance
to extinction (fastest reversal). Additional
analyses of covariance at successive five-trial
intervals continue to show statistically
significant differences in the same direction.
This is in agreement with Sheffield’s findings
for massed acquisition. An analysis of
covariance for the first five trials of distributed
habit reversal shows no statistically
significant differences among the
groups. However, a similar analysis conducted
on the second five trials shows
statistically significant differences (F = 3.93; 
P < .05) with the 100% group showing the 
least resistance to extinction. Subsequent 
analyses conducted at successive five-trial 
intervals showed even greater statistical 
significance. This finding does not agree with 
Sheffield’s results for distributed acquisition. 
It does, however, substantiate the findings of 
Weinstock. 

The present experiment can be added to a 
growing body of studies denying the Shef- 
field aftereffects hypothesis. There seems to 
be little doubt now but that PREs can be 
obtained under both massed and distributed 
conditions and must be accounted for by any 
theory attempting to explain PREs. Whether 
or not Weinstock’s habituation hypothesis is
the correct interpretation the writer cannot
say, but the present data are in agreement 
with it.


THE UTILITY OF CORRECTLY
PREDICTING INFREQUENT EVENTS


ANTHONY BRAVOS


Johns Hopkins University


Brackbill, Kappy, and Starr (1962) found
that maximum gain responding increased
with increasing amounts of reward for correct
prediction. The authors’ expectation that a
first-order sequence analysis of their data
would show previous actual occurrence, rather
than previous prediction, to be the only
reliable predictor from Trial n - 1 to Trial n,
was not confirmed for n - 1 trials on which
the less frequent event actually occurred.
Maximum gain responding more often followed
success in predicting the less frequent
event than lack of success in predicting it.
This effect suggested a second, independent
source of reinforcement: the utility to S of
correctly predicting the occurrence of the less
frequent event. Whatever the interpretation,
the sequence analysis findings are not directly
predictable from reinforcement theory nor
from current theories of probability learning
(cf. Suppes & Atkinson, 1960). It seemed
advisable, therefore, to find out whether these
results were reproducible and whether their
occurrence was limited to the particular
values of the experimental parameters of the
original study.

Method.—First-order sequence analyses
were performed on 12 independent sets of
noncontingent probability learning data previously
collected. These data were obtained
under the same experimental conditions as 
those of the Brackbill, Kappy, and Starr (1962) 
study except for variation of the following 
parameters: amount of tangible reward given 
for a correct prediction; number of stimulus 
events; relative frequency of occurrence of 
the stimulus events; number of Ss; S’s age 
and grade in school; and number and series 
position of the asymptotic trials within each 
sequence analysis. Table 1 shows the value 
used for each of these parameters for each 
of the 12 groups of the present study as well 
as the four groups of the original experiment 
(Rows 2-5). In Table 1, the letters M and L 
stand for the more (or most) and less (or 
least) frequent events. Under “tangible
reward,” 1 M or L shows that one unit of 
reward was given for a correct prediction of 
either event, and 1 M: 4 L shows that one 
unit of reward was given for a correct pre- 
diction of the more frequent event, four units 
for a correct prediction of the less frequent 
event. A unit of reward was 1 marble for the 
younger Ss and 1 point for the older Ss; 
100 marbles were exchanged for one toy, and
100 points for $1.00. In the last five rows of
... indicate that Table 1 does not include the
sequence analysis results for the stimulus 
events of intermediate frequency under the 
three-stimulus conditions. 

Results and discussion.—The last four
columns of Table 1 show the mean probabilities
of predicting Event M on Trial n
given the prediction (p) and actual occurrence
(o) on Trial n - 1. Thus, for example, the
entry in the upper right-hand cell indicates
that, for those instances in which Ss had
predicted the less frequent event (Lp) on
Trial n - 1, and the less frequent event had
actually occurred (Lo) on Trial n - 1, the
mean probability of predicting the more
frequent event (M) on Trial n was .68.
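
A first-order sequence analysis of this kind can be sketched as a tally of conditional relative frequencies. The example below is only illustrative: the function name `first_order_analysis`, the two-event coding ("M" and "L"), and the short made-up trial series are assumptions, not the procedure or data of the study, which averaged such values over Ss on the asymptotic trials.

```python
from collections import defaultdict

def first_order_analysis(predictions, outcomes):
    """P(predict M on Trial n | prediction and outcome on Trial n - 1).

    Keys are strings such as "LpLo": S predicted L (p) and L occurred (o)
    on Trial n - 1. Values are relative frequencies of predicting M on Trial n.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [M predictions, total]
    for n in range(1, len(predictions)):
        key = f"{predictions[n - 1]}p{outcomes[n - 1]}o"
        counts[key][1] += 1
        if predictions[n] == "M":
            counts[key][0] += 1
    return {key: m / total for key, (m, total) in counts.items()}

# Hypothetical series for one S (not data from the study).
predictions = list("MMLMLLMMLM")
outcomes = list("MLLMMLMMLM")

table = first_order_analysis(predictions, outcomes)
for key in sorted(table):
    print(key, round(table[key], 2))
```

Each key corresponds to one of the four conditional columns described above (MpMo, LpMo, MpLo, LpLo); a cell that never occurs in the series is simply absent from the result.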

The question under investigation is
whether S’s prediction on Trial n is determined
by the nature of his previous prediction
as well as by the previous actual occurrence
or reinforcement. Therefore, it is appropriate
to compare the MpMo to the LpMo probabilities
and the LpLo to the MpLo probabilities.
For the present data, shown in
Rows 1 and 6-16 of Table 1, the mean value
of MpMo exceeds that of LpMo in 6 cases out of
12, while the mean value of LpLo exceeds that
of MpLo in 10 cases out of 12 (P = .04, by
binomial expansion). For all 16 sets of data,
the mean value of LpLo exceeds that of MpLo
in 14 cases (P = .004).
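
The two probabilities quoted above can be reproduced by expanding the binomial for equally likely outcomes. The sketch assumes a two-tailed test with p = .5, which the report does not state explicitly; the helper name `binomial_two_tailed` is illustrative.

```python
from math import comb

def binomial_two_tailed(successes, n):
    """Two-tailed binomial probability of a split at least this extreme, p = .5."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

p_12 = binomial_two_tailed(10, 12)  # LpLo exceeds MpLo in 10 of 12 sets
p_16 = binomial_two_tailed(14, 16)  # LpLo exceeds MpLo in 14 of 16 sets
p_even = binomial_two_tailed(6, 12)  # MpMo exceeds LpMo in 6 of 12 sets

print(round(p_12, 2), round(p_16, 3), p_even)
```

Under the two-tailed assumption the first two values round to .04 and .004, in line with the text; a one-tailed reading would give roughly half those values. The 6-of-12 split is exactly what chance predicts.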

In spite of wide variations within several
experimental parameters, the same result
has emerged as before. In order to maximize
prediction to Trial n from preceding trials on
which the less or least frequent event occurred,
it is necessary to consider S’s previous prediction
in addition to the previous actual
occurrence. The direction of the effect
in the present data supports the original
interpretation: that there is a relatively
greater utility to S of correctly predicting the
occurrence of the less or least frequent
event. It would be interesting to see whether the
same phenomenon might occur generally in
any type of learning situation in which S,
finding E’s “game” tedious and uninteresting,
can and does invent one of his own.


Peterson and Peterson (1959) report clear-cut
evidence of a progressive improvement in
recall scores with an increase in the number
of repetitions of the material by S before the
delay of recall began. Certain anomalies in 
their data suggested the obtained differences 
might have resulted from E unintentionally 
interfering with S’s response pattern. The 
present experiment repeated the Petersons’ 
work using visually presented material to 
reduce the likelihood of inadvertent in- 
terference. 

Method.—The verbal items were three- 
consonant units with a Witmer association 
value no greater than 33%. The material 
used to keep Ss active during the recall delay 
interval consisted of groups of three randomly 
selected digits. 

One three-consonant unit and a series of
digit groups were typed as a list on a memory
drum tape. There were eight such lists on a
tape and a pair of tapes constituted a set of
all 16 experimental conditions in random
order, i.e., one, two, four, and eight repetitions
of the three-consonant unit and recall delay
intervals of 3, 9, 18, and 27 sec. The stimuli
were displayed at a rate of 1/sec.
Five seconds after E started the memory
drum, a green star appeared in the window
as a warning that the consonant unit was
about to appear. The S was instructed to
read aloud what appeared in the window and
not to anticipate what might appear next.
This was done to decrease the possibility
that S was preparing for another rehearsal as
the three digits of the intervening activity
appeared. When the entire list had been
presented, a red star appeared as a signal to
recall the consonants presented at the start of
the list. The intervening activity consisted
in reading groups of three digits. After recall
of the consonants was completed, S was
required to make two judgments about these
numbers, estimates as to which digit had
appeared least frequently and which most
frequently. This was done to make the
number task a more meaningful part of the
experiment. On each of the 5 days, a
different pair of lists was presented in a
random order to each S. The 25 paid Ss
were housewives.

1 Defence Research Medical Laboratories Project No.
246, DRML Report No. 246-15, PCC No. D77-94-20-46,
H. R. No. 219.

TABLE 1
PROPORTIONS OF ITEMS CORRECTLY RECALLED
[Table body not recovered; row variable: Number of Presentations]
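
The set of 16 conditions described in the Method can be laid out as a 4 × 4 crossing that is shuffled and split over a pair of eight-list tapes. This is a sketch only: the report says the conditions were in random order but not how the order was generated, so the shuffle, the seed, and the names `build_tape_pair`, `REPETITIONS`, and `DELAYS_SEC` are assumptions.

```python
import random
from itertools import product

REPETITIONS = (1, 2, 4, 8)     # repetitions of the three-consonant unit
DELAYS_SEC = (3, 9, 18, 27)    # recall delay intervals in seconds
LISTS_PER_TAPE = 8

def build_tape_pair(seed=None):
    """Return two tapes of eight (repetitions, delay) conditions each."""
    rng = random.Random(seed)
    conditions = list(product(REPETITIONS, DELAYS_SEC))
    rng.shuffle(conditions)  # assumed randomization method
    return conditions[:LISTS_PER_TAPE], conditions[LISTS_PER_TAPE:]

tape_a, tape_b = build_tape_pair(seed=0)
print(len(tape_a), len(tape_b), len(set(tape_a + tape_b)))
```

Every condition appears exactly once across the pair, so one pair of tapes gives each S a single trial under each combination of repetitions and delay.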

Results and discussion.—Following Peter- 
son and Peterson (1959), an item was con- 
sidered to be correctly recalled only if every 
consonant was correct and in its proper 
position. Table 1 records the mean propor- 
tion of items recalled correctly for 25 Ss on 
5 days. 
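
The scoring rule stated above (every consonant correct and in its proper position) can be written out explicitly. The sketch below is illustrative; the function names and the sample items are hypothetical and are not taken from the experiment.

```python
def is_correct(presented, recalled):
    """True only if every consonant is right and in its proper position."""
    return len(presented) == len(recalled) and all(
        p == r for p, r in zip(presented, recalled)
    )

def proportion_correct(trials):
    """Proportion of (presented, recalled) pairs scored as correct."""
    scores = [is_correct(p, r) for p, r in trials]
    return sum(scores) / len(scores)

# Hypothetical trials: exact recall, transposition, omission, exact recall.
trials = [("KBF", "KBF"), ("QZD", "QDZ"), ("MHR", "MH"), ("TLX", "TLX")]
prop = proportion_correct(trials)
print(prop)
```

Under this strict criterion a transposed or incomplete response counts as an error, so only two of the four sample items are scored as recalled.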

These data confirm the Petersons’ con- 
clusions that there is better recall with an 
increase in the number of stimulus repetitions 
and with shorter periods of delay before 
recall. 

An analysis of variance showed that number 
of presentations and recall delay interval are 
both significant (P < .01). The only signifi- 
cant interaction was recall delay with number 
of presentations. This arises because the 
effect of an increase in recall delay was more 
pronounced on the trials where the consonant 
groups were presented once or twice than 
when they were presented four or eight times. 

The Ss in the present experiment obtained 
markedly higher recall scores than those 
reported by Peterson and Peterson, perhaps 
because in the present study the stimuli were 
presented both visually and aurally. Inspection
of the present data, confirmed by an
analysis of variance, yields no evidence for 
learning over blocks of trials.